Transmission performance of the telephone message network has improved
steadily over the years. Coincidental with this improvement has
been the evolution of rating plans which provide the basis for transmission 
planning, design, and evaluation. Loudness of telephone speech was an 
important consideration in this evolution. 

Telephone speech loudness continues to be one of the major factors which 
need to be taken into account in telephone transmission engineering. This 
paper covers a laboratory system, called EARS (Electro-Acoustic Rating
System), devised to make objective measurements of partial and overall 
telephone connections (including electro-acoustic transducer efficiencies) in a 
manner which reflects subjective loudness loss. Topics covered include a
historical review of rating plans, computation of speech loudness, evolution 
of the EARS, and description of the system and its capabilities. 

The EARS essentially comprises a sound source and a meter for measuring
either acoustical pressure or electrical voltage. Design of the source
and meter is based on telephone speech loudness considerations, and
measurements made with the system approximate subjective loudness 
judgments with an accuracy which is sufficient for telephone engineering 
purposes. 

The EARS may be used in implementing any telephone transmission 
rating plan incorporating speech loudness loss as an element. In such 
application, the system can be used for specifying connection losses, thus 
eliminating the need for extensive subjective tests of loudness. However, 
subjective tests will still be required to evaluate effects of other elements, 
e.g., noise, important in any given plan, and to evaluate the range of loud- 
ness losses acceptable for that plan. 

I. INTRODUCTION 

A basic Bell System objective, to provide our customers with the
best possible telephone message transmission consistent with the
state of the art and the economic climate, has remained essentially
unchanged since the early days of telephony. However, continuing 
review of the telephone system in terms of this objective has resulted 
in steady improvement in transmission performance of the system 
over the years. Such improvement has been made possible by growth 
in our technical skills; it has been made necessary by evolving 
customer needs for improved transmission. Indeed, it has been postu- 
lated that as our customers use the telephone, they become ac- 
customed to current performance and come to expect further im- 
provement. 1,2 

What do we mean by telephone message transmission performance? 
In the broadest sense, this refers to the effect of the system on speech 
signals when these signals are transmitted over telephone connections. 
Customers conversing over telephone connections want to hear reasonably
faithful, undistorted reproductions of each other's voices with
a minimum of effort. Connections for which these conditions pertain 
can be thought of as providing satisfactory transmission performance. 
Connections exhibiting severe distortion would thus provide some- 
thing less than satisfactory performance; customers might be able to 
converse but only with extreme difficulty. 

Speech transmission capabilities of the telephone network are often 
considered in terms of individual transmission parameters, the com- 
bination of which determines overall transmission performance. Some 
of the more important parameters are loss, amplitude distortion, and 
unwanted interferences such as noise, crosstalk, and echo. Improve- 
ment in performance over the years has been achieved by design to 
control these parameters, singly and in combination, as dictated by 
the technology, economics, and needs of the times. 

Coincidental with this improvement in transmission performance 
has been the evolution of telephone transmission rating. The problem 
of rating, that is evaluation and measurement, has been the subject of 
much thought and work over many years, and a number of different 
rating plans evolved to meet the transmission design needs for an 
improving and expanding telephone message network. 

For present purposes, a transmission rating plan comprises (i) a 
rating criterion which represents the basis for rating telephone con- 
nections, (ii) a reference system (may be a physical simulation of an 
overall telephone connection, a set of definitions, or both) with some 
adjustable feature in terms of which connection ratings are established
based on the selected criterion, and (iii) a rating scale which
is essentially represented by the variable feature of the reference
system. Thus, a rating plan provides a framework for the design and
evaluation of telephone connections. 

Historically, the rating criterion was subjective in nature, and ex- 
tensive subjective testing was required to evaluate telephone connec- 
tions in terms of the reference. Moreover, application of the various 
rating plans required subjective testing to evaluate scale values, i.e., 
determine what scale values (and connections) represented acceptable 
performance. 

Transmission performance of the telephone message network has 
improved to a point that loudness loss is a major variable which has 
to be taken into account in transmission planning. This paper describes
a laboratory rating system, called the EARS (Electro-Acoustic
Rating System), which can be used to objectively measure loudness
loss of telephone connections in a manner which closely approximates 
subjective loudness judgments. Thus, the system supplants subjective 
tests which would ordinarily be required to determine loudness ratings 
of connections. However, tests will still be needed both to determine 
subjective reaction to various amounts of loudness loss, and to deter- 
mine interrelationships between loudness loss and other transmission 
parameters important in any given rating plan. 

The EARS was devised to measure acoustic pressures and electric 
voltage as required by loudness rating definitions for partial and 
overall telephone connections. 3 The EARS has been used extensively
over the past several years to characterize the loudness performance 
of telephone sets and connections, and to evaluate design plans such 
as unigauge design of the customer loop plant. 4 Moreover, loudness 
loss as measured with the EARS is an important element in current 
studies of transmission planning based on a multiparameter approach 
in which other transmission factors, e.g., noise and echo, are included 
with loudness loss. 5 Also, the EARS concept is one of several candidates
for adoption as a standard method of specifying loudness loss.*

The EARS essentially comprises a sound source which is used to 
energize the talking end of a telephone connection and an indicating 
meter which is used to measure acoustic pressure or electrical voltage, 
and provides a simple means of measuring input and output signal 
levels for partial and overall telephone connections. Loudness losses 



* Methods of measuring the loudness loss of telephone connections are currently
being studied by (i) Study Group XII of the CCITT (Comité Consultatif
International Télégraphique et Téléphonique, the International Telegraph
and Telephone Consultative Committee) as outlined in Question 15/XII of Ref.
6 and (ii) the Task Force on Telephone Instrument Testing of the Institute of
Electrical and Electronics Engineers.
of connections and connection components, including electro-acoustic
transducers, i.e., telephone transmitters and receivers, are then the 
differences between input and output signal levels expressed in dB-like 
terms relative to appropriate reference signal levels. 
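
The definition just given can be illustrated with a minimal sketch. It assumes that each signal level is formed as 20 log10 of a pressure or voltage reading divided by its reference value; the function names and the numerical readings are hypothetical and are not taken from the EARS itself, whose meter weights the signal in a manner described later in the paper.

```python
import math


def level_db(value, reference):
    """Level of a pressure or voltage reading, in dB relative to a reference.

    Assumption: the 20*log10 form is used because pressure and voltage
    are both amplitude-like quantities.
    """
    if value <= 0 or reference <= 0:
        raise ValueError("value and reference must be positive")
    return 20.0 * math.log10(value / reference)


def loudness_loss_db(input_value, input_ref, output_value, output_ref):
    """Loss = input level minus output level, each taken relative to
    its own reference signal level."""
    return level_db(input_value, input_ref) - level_db(output_value, output_ref)


# Hypothetical readings: the output is one tenth of the input, and both
# references are taken as unity, so the loss is 20 dB.
loss = loudness_loss_db(1.0, 1.0, 0.1, 1.0)
print(round(loss, 3))
```

Keeping a separate reference for the input and the output allows the same function to cover acoustic-to-electric, electric-to-acoustic, and acoustic-to-acoustic measurements, where the two ends of the measurement are in different physical units.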

The approach followed in this paper is first to consider the history 
of transmission rating plans and, second, to discuss the evolution of 
the EARS. Discussion of rating plans in Section II demonstrates the 
interrelationships of rating plans, rating systems, and network trans- 
mission performance improvements. Concluding remarks outline the 
relation between loudness rating definitions and the rating system, 
the EARS, which is used to determine parameter values as required 
by the rating definitions. Also mentioned are current studies con- 
cerned with multiparameter network design and evaluation. 

Discussion of the evolution of the EARS involves several steps. We 
begin in Section III by addressing ourselves to a review of various 
techniques for computing the loudness of tones, noise, and speech, then 
discuss in some detail the derivation of a particular speech loudness 
computation method. This discussion indicates the manner in which 
frequency response characteristics of partial and overall connections 
can be measured and shows how, from the measured response for a 
given connection, we can compute a number which we will call loud- 
ness. Also included is a comparison of computed and experimental 
results. 

Section IV covers derivation of the EARS from the speech loudness 
computation method referred to above and describes the EARS in its 
present form. Also described is a graphical computation method based 
on the design concepts leading to the EARS. 

In Section V, we discuss the accuracy of the EARS in predicting 
subjective test results and consider the effects of simplifying assump- 
tions employed in deriving the EARS. The vehicle for this is the 
graphical method referred to above. Results computed using this 
method are compared to observed results for the subjective tests used 
in validating the speech loudness computation method. This approach 
is necessary since the subjective test systems are no longer available, 
and hence could not be measured directly using the EARS. Therefore, 
the comparison of computed and observed results reflects the accuracy 
of the concepts on which the EARS is based, and not of the EARS 
itself. 

As will become evident, the EARS in its present form is only one 
of several ways in which the computational method could be implemented
as a laboratory measuring system, and is not an exact realization
of the computational method. This form was selected because it
provides a satisfactory combination of simple equipment arrange- 
ments and suitable calibration and measurement procedures. 

II. EVOLUTION OF TRANSMISSION RATING PLANS 

Historically, transmission rating plans depended on (i) the use of 
reference circuits with which commercial circuits could be compared 
and (ii) systems of units for expressing the relative ratings thus 
determined. A great deal of subjective testing was necessary to deter- 
mine ratings of commercial telephone equipments and facilities, and 
to correlate these ratings with measurable transmission properties. 
These ratings, in terms of the appropriate prevalent unit for express- 
ing them, were used in design and evaluation of the telephone message 
network. 

Considerable work was also necessary to determine transmission 
objectives for the network. That is, the reference system and related 
units provided the means of expressing transmission performance; it 
was then necessary to determine the rating, or range of ratings, to 
which the plant should be designed. 

2.1 Transmission Equivalent Plan 

For a time following the turn of the century, telephone circuits in 
commercial use had quite similar characteristics, and conditions were 
such that essentially only loudness, or volume, capability was under 
control of the transmission engineer. The performance of circuits was 
determined by comparing them, on a loudness basis, with a reference 
circuit which was adjustable in attenuation, but whose other trans- 
mission characteristics were typical of commercial circuits. 

The reference circuit in use at that time was known as the Standard 
Cable Reference System (SCRS). 7-9 It consisted of telephone sets
(including transmitters and receivers), cord circuits (for supplying the 
telephone set carbon transmitters with operating current), and an 
adjustable artificial line for interconnecting the cord circuits. These 
connection components were representative of types then used com- 
mercially. 

The rating of telephone circuits by comparing them to the reference 
circuit involved a talker speaking alternately over the test circuit and 
the reference circuit, and a listener switching similarly at the receiving 
ends. The reference circuit artificial line, calibrated in terms of "miles 
of standard cable," was adjusted until the listener judged the volume 
or loudness of the speech sounds reproduced by the two circuits to be
equal. The number of miles of artificial line in the reference circuit 
was then used as the "transmission equivalent" of the circuit under 
test. 7 The effect of any change in a test circuit on the efficiency of 
that circuit could then be measured by determining the "transmission 
equivalent" before and after the change; the number of miles of 
artificial line required to compensate for the change was used as an 
index of this effect. 

Performance ratings assigned in terms of loss of speech volume or 
loudness based on the comparison procedure outlined above constituted
a practicable and effective means of assessing transmission performance.
The adverse effects of sidetone, distortion, and noise on
transmission quality were recognized, but no way was known of 
incorporating them, together with loudness loss, into a single figure 
of merit for rating transmission performance. 3 Moreover, the simi- 
larity between the reference circuit and commercial circuits at that 
time rendered such incorporation unnecessary. 

As the state of the telephone art developed, modifications in the 
reference circuit became desirable in order for the circuit to more 
satisfactorily fulfill its purpose. Three factors leading to the modifications
were (i) improved designs of telephone instruments and circuits,
(ii) improved measuring techniques and instruments, and (iii)
the need for a more suitable unit than the "miles of standard cable." 

The effect of improved design was to introduce into the plant 
telephone instruments and circuits which had less distortion than 
corresponding parts of the Standard Cable Reference System. For 
this reason, it became desirable to have a new reference system with 
which transmission over the most perfect telephone circuit, or over 
circuits less perfect, could be simulated at will. 8,9

Secondly, the performance of the Standard Cable Reference System 
was specified by stating the kinds of apparatus and circuits used. The 
electrical portion of the system could be checked by voltage, current, 
and impedance measurements. However, for the transmitters and 
receivers, reliance for constancy of performance was placed primarily 
upon the careful maintenance and frequent cross comparisons (sub- 
jective) of a group of transmitters and receivers which were specially 
constructed to reduce some of the sources of variation in the regular 
product instruments. 8,9 Improvements in measuring techniques and
instruments made it possible to measure objectively characteristics 
of electro-acoustic transducers. This, in turn, permitted selection and 
maintenance of transducers to provide long-term stability. 
Finally, a change in rating units became desirable for two reasons.
First, circuits were being designed which had less distortion than the 
artificial cable of the Standard Cable Reference System. Since a mile 
of cable with this system corresponded not only to a certain volume 
change, but also to a distortion change, characterizing new circuits 
in terms of miles of standard cable implied a distortion degradation 
not necessarily attributable to the circuit under test. Secondly, two 
different reference systems were in common use, one in the United 
States, the other in some other countries. 10 These systems used 
artificial cable with different characteristics. That used in the United 
States had a loop resistance of 88 ohms and a capacitance of 0.054 
microfarad per mile; the other used artificial cable having the same 
resistance and capacitance but had, in addition, an inductance of 1 
millihenry and a conductance of 1 micromho per mile. Thus, a test 
circuit compared with each of these references would have two dif- 
ferent ratings assigned to it. The first of the above reasons suggested 
the need for a distortionless unit, the second for a common unit. 

A new unit, called the Transmission Unit (TU), was devised by 
the Bell System. 10 * This was a distortionless, logarithmic unit so 
chosen as to make use of common logarithms convenient in transmis- 
sion computations. Its magnitude was very nearly the same as the loss 
of a mile of standard cable and, thus, existing experience learned in 
terms of miles of standard cable could be transferred to the new 
system with a minimum of difficulty. The Transmission Unit later 
became the decibel (dB). 11 
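
The closeness of the two units can be checked with a short calculation. The sketch below assumes that a current ratio is converted to a power ratio by squaring (that is, equal impedances at the two points compared); the current ratio of 1.115 per "800 Cycle Mile" is the figure quoted in the footnote to this section, and the function names are illustrative only.

```python
import math


def power_ratio_to_tu(power_ratio):
    """Transmission Units (later decibels) for a given power ratio."""
    return 10.0 * math.log10(power_ratio)


def current_ratio_to_tu(current_ratio):
    """Transmission Units for a given current ratio.

    Assumption: equal impedances, so the power ratio is the square of
    the current ratio and the multiplier becomes 20.
    """
    return 20.0 * math.log10(current_ratio)


# One Transmission Unit corresponds to a power ratio of 10**0.1.
one_tu = power_ratio_to_tu(10 ** 0.1)

# One "800 Cycle Mile" corresponds to a current ratio of 1.115.
mile_in_tu = current_ratio_to_tu(1.115)

print(round(one_tu, 3), round(mile_in_tu, 3))
```

Under these assumptions the 800 Cycle Mile comes out slightly below one Transmission Unit, which is consistent with the statement that ratings in miles of standard cable could be carried over to the new unit with little adjustment.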

The three factors discussed in preceding paragraphs resulted in 
design of the Master Reference System (MRS). 8,9 This system utilized
transducers with very low distortion, amplifiers to compensate for the 
lower efficiency of these transducers as compared to commercial in- 
struments, and an artificial line (consisting essentially of a 600-ohm 
attenuator) for interconnecting the transmitting and receiving ele- 
ments. The system included provision for inserting distorting networks 
into the transmitting and receiving elements so that performance of 



* An intervening step between the "Mile of Standard Cable" and the "Transmission
Unit" was the "800 Cycle Mile." This also was a distortionless logarithmic
unit, equal in magnitude to the loss of a mile of standard cable at 800 Hz
(actually, the loss at 796 Hz was used). There were thus two "800 Cycle Mile"
units because of the two different standard cable specifications. The "800 Cycle
Mile" and its successor, the "Transmission Unit," differed in that (i) the former
represented a current ratio while the latter represented a power ratio and (ii)
although both were logarithmic, the former was in units of log 1.115 (one 800
Cycle Mile corresponded to a current ratio of 1.115) while the latter was in units
of 0.1 log 10 (one Transmission Unit corresponded to a power ratio of 10^0.1).
commercial transducers, characterized by amplitude response curves
with pronounced resonances, could be simulated. 

Transmission engineering was still done on a loudness basis, and 
the MRS was used to obtain ratings in the same manner as that 
previously described for the SCRS. That is, (i) speakers talked 
alternately over the reference and test systems, (ii) observers listened 
over the systems, switching with the talkers, and (iii) the line (600-ohm
attenuator) of the MRS was adjusted to obtain equal loudness.
The average setting of the attenuator (in dB) at balance was then 
the rating of the system under test. 

Adoption of the MRS by the Bell System had important long-term 
effects on international standardization of telephone reference sys- 
tems. In 1926, the CCI* invited representatives of the Bell System 
to meet with a committee appointed by the CCI to consider adoption 
of a transmission reference system. 9 At the recommendation of this 
committee, the CCI adopted the MRS. Two of these systems were 
built by the Bell System. One was retained at Bell Telephone Lab- 
oratories in New York, the other was sent to the laboratory of the 
CCI in Paris in 1928. The CCI System was designated the SFERT.†

In the late 1950s, the SFERT was replaced by a new reference 
system, the NOSFER.* The NOSFER is so constructed and calibrated 
that it is essentially equivalent to the SFERT. Telephone connection 
ratings obtained with either the SFERT or the NOSFER are desig- 
nated as Reference Equivalents (RE) in dB, and are numerically 
equal to the setting (in dB) of the reference system line attenuator 
required for loudness balance. 

2.2 Effective Loss Plan 

In the early 1930s, the Bell System adopted a new transmission 
plan. The change became necessary because of technological advances. 
Telephone sets incorporating antisidetone circuitry and improved 
transducers began to be used in quantity in the Bell System. These 
sets had poorer loudness performance than their predecessors, but 



* CCI = Comité Consultatif International des Communications Téléphoniques
à Grande Distance (International Consultative Committee for Long Distance
Telephone Communication). This organization later became the CCIF and is
now the CCITT.

† SFERT = Système Fondamental Européen de Référence pour la Transmission
Téléphonique (European Fundamental Reference System for Telephone
Transmission). See Ref. 12.

* NOSFER = Nouveau Système Fondamental pour la Détermination des
Équivalents de Référence (New Fundamental System for the Determination of
Reference Equivalents). See Ref. 6.
provided marked improvement in such characteristics as sidetone,
amplitude distortion, and nonlinear distortion, all of which have a 
marked effect on transmission. Evaluation of these telephone sets in 
terms of the old system tended to emphasize loudness differences at 
the expense of these other factors. Thus, there was need for a rating 
plan which properly recognized the improvements and, at the same 
time, retained applicability for the older telephone set types. The 
effective loss plan was devised to meet this need. 13,14

Ratings under the new plan were in terms of dB of "effective" loss 
to distinguish them from "loudness" or "volume" losses of the old 
plan. Effective loss represented a figure of merit for evaluating the 
effectiveness of the transmission over telephone circuits and, as such, 
was a measure of the ability of telephone listeners to understand as 
well as to hear transmitted telephone speech. 

Effective transmission data were determined in terms of the Work- 
ing Reference System (WRS). 14 This system consisted of representa- 
tive customer loops and telephone sets and a variable, distortionless 
trunk, i.e., an attenuator. Ratings for commercial connections were 
obtained in effect by comparing these, using live talkers and listeners, 
with the reference system, adjusting the distortionless trunk until the 
reference system and test connection were judged to provide equivalent 
transmission. (Basic data for the plan were obtained from two-way 
conversation tests, in contrast to the earlier plan for which tests 
were on a listening-only basis.) The criterion for such equality was 
repetition rate, that is, the rate of occurrence of repetitions requested 
by test subjects. 13,14 Thus, balance was achieved when the WRS (with
a particular trunk setting) and the test connection provided the same 
repetition rate. The WRS trunk setting, in dB, was then a measure 
of the effectiveness of the circuit under test. The reference trunk 
setting for the Working Reference System was selected to provide 
numerical equivalence of ratings obtained with the new plan and its 
predecessor. With reference trunk setting, the Working Reference 
System had an effective loss of 18 dB which was also its transmission 
equivalent in terms of the earlier plan. 

During the early 1930s when the effective loss plan was under 
development, in-plant telephone set transducers were predominantly 
of the resonant type, characterized by large amounts of amplitude 
and nonlinear distortion, telephone sets were largely of the sidetone 
variety, and a number of low cutoff cable loading systems were in 
use. However, improved transducers and antisidetone sets were being 
introduced into the plant, and additional improvements were under 
development. (The transitional nature of the telephone set plant
at that time and additional planned improvements were important 
considerations leading to adoption of the effective loss plan.) This 
resulted in introduction of the 300-type telephone set into the plant 
in 1937. 15-17 Subsequent study similarly resulted in introduction of the
500-type telephone set into the plant in 1950. 18,19 The effective loss plan
encouraged these improvements and, also, the adoption of higher 
cutoff cable loading systems. 

By the mid 1950s, telephone sets provided sufficient antisidetone 
and freedom from amplitude and nonlinear distortion that little could 
be gained by further improvements. Thus, loudness appeared to be 
once again a major variable factor in plant design, likely to be the 
most important consideration in subsequent telephone set design. 

The uniformly high level of transmission performance characteriz- 
ing the plant suggested revision of telephone transmission planning 
to reflect more emphasis on loudness effects. Impetus for such revision
was provided by (i) the availability of a speech loudness computation 
technique and a laboratory measuring system based on this technique 
and (ii) the need for planning in simpler terms than was possible 
with the effective loss plan which suffered from a number of practical 
disadvantages. 3 

2.3 Loudness Rating Plan 3 

For present purposes, the loudness rating plan comprises definitions 
for loudness ratings of partial and overall telephone connections.*
Loudness is a subjective quantity and, therefore, loudness rating is 
defined as the ratio of the loudness of the speech into the listener's 
ear to the loudness of the speech out of the talker's mouth. As used 
here, however, loudness rating is considered to be determined by 
objective measurements. 

The rating definitions involve both acoustic pressures and electric 
voltages. The EARS provides a means of measuring these quantities. 
The EARS consists of a source of acoustic energy, comprising a com- 
plex voice-frequency test tone which simulates certain properties of 
human speech and an artificial mouth, and an indicating meter for 
measuring voltages or sound pressures in a manner which simulates 
the loudness perception of an average listener. 

Ratings obtained using the EARS do not precisely equal ratings 

* Loudness ratings as specified by the definitions are expressed in decibels. 
Such usage does not strictly conform to the definition of the decibel, but this 
should have no effect on general use of the term "decibel." 
obtained using human beings, in part because the artificial mouth and
the 6-cm³ coupler do not precisely duplicate the characteristics of their
human counterparts. However, EARS ratings are sufficiently accurate 
to be highly useful in telephone transmission engineering although some 
problems associated with using the EARS to measure nonlinear trans- 
ducers, e.g., carbon microphones, remain to be solved. Continuing 
research aimed at characterizing human mouths and ears may result in 
artificial counterparts the use of which in the EARS will render the 
differentiation between EARS and subjectively determined loudness 
ratings unnecessary. 

The EARS is thus an instrument used to measure the loudness 
performance of telephone instruments and facilities in much the same 
manner that the 3A Noise Measuring Set is used to measure the 
subjective magnitude of telephone message circuit noise. 20 Loudness 
performance is expressed as constants and/or families of curves which 
are used in transmission planning. Examples of such use are the 
unigauge plan already cited 4 and current studies which are contemplating
transmission design based on (i) noise and loss and on (ii)
noise, loss, and echo. 5 

Earlier rating plans utilized physical reference systems which 
simulated overall telephone connections. Loudness rating as considered 
here is based on measurement of physical quantities, and does not 
require specification of a standard simulated connection. 

III. COMPUTATION OF LOUDNESS OF SPEECH 

A number of tests have been conducted to determine the loudness 
of telephone speech signals. Study of these test results, and of various 
methods devised for computing the loudness of speech and tones, 
resulted in a particular speech loudness computational procedure. This 
procedure, developed some time ago but not previously reported in 
the literature, is discussed in some detail in the present section. 

We begin with a summary of the essential attributes of the pro- 
cedure with appropriate reference to functions required in the com- 
putation of loudness ratings. Following this, several methods for 
computing the loudness of continuous spectrum sounds are referred 
to, and the background for the specific computation method of con- 
cern herein is reviewed. We then derive the functions required for 
the computational procedure. Finally, the procedure is used to com- 
pute the loudness performance of several laboratory test systems, and 
the computed values are compared to experimental results. 
The computational method is not based on fundamental properties
of speech sounds and of the hearing mechanism and, therefore, is 
probably not generally applicable in determining the loudness of 
other types of sounds. Rather, the method is based on a "black box"
approach in which a defined "speech" signal is applied at the input,
and the signal so processed that the output of the black box cor- 
relates with subjective experience of speech loudness. Internal opera- 
tion of the black box is specified, but such is not intended to reflect 
exactly the specific manner in which the human being processes speech 
stimuli to obtain loudness although similarities are noted. This ap- 
proach is used because, although there are a number of theories 
covering loudness perception, the specific operation of the hearing 
mechanism in determining loudness is not known. 

3.1 Description of Speech Loudness Computational Method 

The method, depicted on Fig. 1, is based on performing certain 
operations on the pressure spectrum delivered by a telephone con- 
nection to the ear of a listener. In essence, these operations comprise 
(i) dividing the received speech spectrum into a number of different 
frequency bands, (ii) determining the loudness due to each band, 
and (iii) summing across all bands to obtain the total loudness.

The received pressure spectrum consists of (i) a reference speech 
spectrum applied at the transmitting end of a connection modified 
by (ii) the amplitude transfer characteristic of the connection. The 
compromise spectrum and system response definitions are given on 
Figs. 2 and 3 respectively.* 

Not all of the received spectrum contributes to loudness, i.e., that 
portion of the spectrum lower in level than the threshold of hearing 
does not contribute. Account is taken of this by defining a quantity 
termed effective spectrum which is the received spectrum minus X, 
the threshold of audibility for continuous spectrum sounds. The X 
function is shown on Fig. 4. 

The effective spectrum is divided into 50 frequency bands selected 
such that each contributes equally to the total loudness produced by 
a flat effective spectrum. The frequency limits for the "2 percent" 
loudness bands can be derived from the function of Fig. 5. 
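
One way the band limits might be read off such a function is sketched below. It assumes, purely for illustration, that the function of Fig. 5 can be tabulated as a strictly increasing cumulative curve (fraction of total loudness contributed below each frequency) and that linear interpolation between tabulated points is adequate. The five-point table is a made-up placeholder, not data from Fig. 5, and the function name is hypothetical.

```python
def band_edges(freqs_hz, cumulative, n_bands=50):
    """Return n_bands + 1 edge frequencies at which a cumulative curve
    crosses equal steps of 1/n_bands of its total range.

    freqs_hz and cumulative are parallel lists; cumulative is assumed
    to be strictly increasing. Linear interpolation is an assumption.
    """
    span = cumulative[-1] - cumulative[0]
    edges = []
    j = 0  # index of the table segment currently being interpolated
    for k in range(n_bands + 1):
        target = cumulative[0] + span * k / n_bands
        # Advance to the segment that contains the target value.
        while j < len(cumulative) - 2 and cumulative[j + 1] < target:
            j += 1
        c0, c1 = cumulative[j], cumulative[j + 1]
        f0, f1 = freqs_hz[j], freqs_hz[j + 1]
        frac = (target - c0) / (c1 - c0)
        edges.append(f0 + frac * (f1 - f0))
    return edges


# Placeholder table (NOT the values of Fig. 5).
freqs = [250.0, 500.0, 1000.0, 2000.0, 4000.0]
cum = [0.0, 0.2, 0.5, 0.8, 1.0]
edges = band_edges(freqs, cum)
print(len(edges), edges[0], edges[25], edges[-1])
```

Each adjacent pair of edges then bounds one "2 percent" band; bands come out narrow where the cumulative curve is steep and wide where it is shallow.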

The effective level in each band is then converted to loudness, in 
loudness units (LU), using the function of Fig. 6, and the loudness 



* The ordinate of Fig. 2 is in terms of dBt. For purposes of this paper, dBt
= dB relative to 2 × 10^-5 newton/meter^2; that is, a pressure p in
newtons/meter^2 has a level of 20 log10(p / (2 × 10^-5)) dBt.






[Fig. 1, block diagram: The reference speech spectrum, $L_s$ (dBt), passes 
through the communication system frequency response, R (dB, + = loss), to 
give the received speech spectrum, $L_s - R$ (dBt). Subtracting the threshold 
of audibility for continuous spectrum sounds, X (dBt), gives the effective 
speech spectrum, $Z_s = L_s - R - X$ (dB). Bandpass filters ($F_0$ to $F_1$, 
$F_1$ to $F_2$, ..., $F_{49}$ to $F_{50}$) divide $Z_s$ into 50 bands, a converter 
changes each band level $Z_{si}$ to a loudness $n_i$, and the total loudness, in 
loudness units, is $N_s = \sum_{i=1}^{50} n_i$. Note: $L_s$, R, X, and $Z_s$ are 
functions of frequency.] 

Fig. 1 — Computation of speech loudness. 



units summed for the 50 bands. The resulting sum, $N_s$, represents 
the total loudness contained in the particular received spectrum. 

Once the total loudness, $N_s$, is obtained, the problem of its inter- 
pretation arises. One useful way of expressing the loudness is in terms 
of loudness level.* Functions which may be used to convert $N_s$ to 
loudness level are shown on Fig. 7. 
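
The chain of operations described above (and summarized in Fig. 1) can be sketched in a few lines of Python. This is only an illustrative sketch: the functions of Figs. 2, 3, 4, and 6 are not tabulated in this section, so the spectrum, response, threshold, and band-loudness functions below are hypothetical placeholders, as are the band center frequencies, which are not the actual 2-percent loudness bands of Fig. 5.

```python
import numpy as np


def speech_loudness(band_centers_hz, ref_spectrum_dbt, response_db,
                    threshold_dbt, band_loudness):
    """Sum per-band loudness to obtain the total loudness N_s (in LU)."""
    f = np.asarray(band_centers_hz, dtype=float)
    # Received spectrum: reference spectrum minus connection loss (+ = loss).
    received = ref_spectrum_dbt(f) - response_db(f)
    # Effective spectrum: received spectrum minus the threshold function X.
    effective = received - threshold_dbt(f)
    # Convert each band's effective level to loudness units and sum.
    return float(np.sum(band_loudness(effective)))


# Hypothetical placeholder inputs, NOT the functions of Figs. 2 to 6.
centers = np.linspace(100.0, 5000.0, 50)               # 50 assumed band centers
flat_spectrum = lambda f: np.full_like(f, 40.0)        # stand-in for Fig. 2
flat_loss = lambda f: np.full_like(f, 10.0)            # stand-in for R
zero_threshold = lambda f: np.zeros_like(f)            # stand-in for X (Fig. 4)
toy_loudness = lambda z: np.maximum(z, 0.0) / 50.0     # stand-in for Fig. 6

total = speech_loudness(centers, flat_spectrum, flat_loss,
                        zero_threshold, toy_loudness)
print(total)
```

Passing the four functions as arguments keeps the summation itself separate from the empirical curves, so the placeholders could be replaced by values read from the figures without changing the procedure.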

A more significant way of expressing loudness for speech com- 
munication problems is in terms of the level of a reference speech 
spectrum. The function relating total loudness to the level of the 
selected reference speech spectrum is given on Fig. 8. 

3.2 Survey of Loudness Studies 

Loudness has been studied extensively as demonstrated by the 
numerous publications relating to this subject. Studies reported have 
been largely concerned with single-frequency tones, tone complexes, 
and continuous spectrum steady sounds. Several different methods 
for computing the loudness of these different sounds have evolved.† 
Some of the methods, specifically those covering continuous spectrum 



* The loudness level of a sound, in phons, is numerically equal to the median 
sound pressure level, in dBt, of a free progressive wave of frequency 1000 Hz 
presented to listeners facing the source, which in a number of trials is judged by 
the listeners to be equally loud. The unit of loudness is the sone. By definition, 
a 1000-Hz tone 40 dB above a listener's threshold produces a loudness of 1 sone; 
the loudness of any sound that is judged by the listener to be n times that of 
the 1-sone tone is n sones. 21 In this paper, we will use the term loudness units 
(LU) when considering loudness of speech, and the term sones in discussing other 
sounds. However, speech of N LU is equal in loudness to any other sound of 
N sones. 

† References 22 through 27 describe some of these computational methods. 
Fig. 2 — Sound pressure spectrum of continuous speech at 2 inches from the lips 
of a talker. 



steady sounds, are sufficiently general that with appropriate specifica- 
tion of the speech signal they may be suitable for computing the 
loudness of speech.* Such suitability has not, to the author's knowl- 
edge, been demonstrated and these methods are not considered in this 
report. 

The study of speech loudness has received somewhat less attention 
than the general subject of loudness. Publications covering the study 
of speech loudness are limited in number, and of these, only a few 
consider speech loudness computational methods.† 

In 1924, H. Fletcher and J. C. Steinberg devised a method for 
computing the loudness loss, due to changing the transmission system 
response, of a sound being transmitted to the ear. 28 Results obtained 
using the method were in close agreement with experimental results 
for the specific sounds — speech and a complex test tone — considered 
in the study. 



* See, for example, Ref. 26. 

† References 28, 30, 31, 32, and 33 report results of some speech loudness tests. 
References 28, 29, and 31 describe speech loudness computation methods de- 
veloped. 
In 1925, Steinberg developed a more general method of computing 
the loudness of any complex sound. 29 (The method referred to in the 
preceding paragraph was a special case.) Results computed using this 
method agreed with experimental results then available, including the 
data reported in the 1924 paper. 

In 1938, W. A. Munson developed a method for computing the 

[Fig. 3, diagram in three parts: 
(A) Orthotelephonic response definitions. Transmitting response = 
20 log $V_1/P_1$; line response = 20 log $V_2/V_1$; receiving response = 
20 log $P_2/V_2$, where 
$P_1$ = pressure at 2 in. in front of real voice, transmitting element absent; 
$V_1$ = open circuit voltage of transmitting element, when spoken into by a real voice; 
$P_2$ = pressure in free field at center of line between observer's right and left ears, 
observer absent; 
$V_2$ = voltage across receiving element when sound from receiving element is as loud 
as the sound from the field. 
Note: pressure and voltage reference levels must be stated. 
(B) Orthotelephonic system. Transmitting element, response = K; receiving 
element, response = -K; line, response = -25 (all as defined in Section A); 
overall = -25. 
(C) Acoustic path simulated by orthotelephonic system.] 

Fig. 3 — Orthotelephonic transmission. 
loudness of speech. The method was based on speech loudness tests 
conducted in 1935 and 1936.* 

Thus, there were available at least two methods, those of Steinberg 
and Munson, for computing the effects of changes in the response 
characteristics of telephone circuits on speech loudness. These methods 
had been found to provide computed results which were reasonably 
consistent with observed results from experiments on which the 
methods were based. A review of these experiments revealed sub- 
stantial inconsistencies between the different sets of data, leading to 
doubt concerning the generality of the procedures. Moreover, a sub- 
stantial amount of additional speech loudness data became available 
in 1939 and 1940. These considerations led to a review of the available 
speech loudness computation procedures and the various sets of sub- 
jective test data for purposes of arriving at a simple procedure, suitable 
for telephone engineering purposes, which would be reasonably con- 
sistent with all of the data. 



* Neither the computational procedure nor the tests were reported in the 
literature. 
Fig. 7 — Relation between loudness and loudness level. 



The speech loudness computation procedure resulting from the 
review referred to above is described in subsequent paragraphs. This 
procedure, developed in 1941, and the 1939-40 speech loudness data 
have not been previously reported, primarily for two reasons. First, 
the group concerned with the speech loudness project was essentially 
disbanded in late 1941 and assigned other work in support of the war 
effort. This group was not re-established so that its accomplishments 
could be reduced to a form suitable for publication in the literature. 
Secondly, there was no pressing need for the procedure until the loud- 
ness rating plan was proposed. 3 Review initiated at that time has 
verified applicability of the procedure to telephone speech loudness 
problems. 

3.3 Development of the Speech Loudness Computation Method 

The speech loudness computation method evolved from a specific 
method devised by Fletcher and Munson for treatment of continuous 
spectrum sounds, and uses as many of the steady-state definitions as 
are applicable. 23 Review of the Fletcher-Munson method therefore 
serves as a convenient introduction. 

3.3.1 Steady-State Sounds — Definitions and Formulae 

Loudness, N, is defined as the intensive attribute of an auditory 
sensation. 21 Loudness can be expressed in terms of a scale whose units 
have been found to agree with the common experience of observers 
estimating the intensive attribute of sounds. Ordinarily, the loudness 
of a sound is not specified in terms of units of a loudness scale, but, 
rather, in terms of its loudness level. (See Section 3.1 footnote.) 

Results of some experiments to determine the relation between 
loudness and loudness level are shown on Fig. 7. The solid curve, due 
to Fletcher and Munson, was first proposed in 1933. 22 

This curve was later adopted in an American Standard. 35 Further 
study by Fletcher and Munson resulted in a slightly different relation 
shown by the long dashed line. This curve, first reported in 1937, 23 
was later modified slightly.* The short-dashed curve, due to S. S. 
Stevens, resulted from effort directed at obtaining the best fit for all 
available data with the simplest possible relation. 2030 ' 37 This relation 
is included in a USA Standard. 38 

The loudness, N, for steady-state sounds can be computed from the 
masking spectrum, M, of the sound.† The loudness equation may be 
written as 

$$N = \int F(M)\,dx. \tag{1}$$



F(M), expressed in loudness units, is a function of the masking, and 
may be interpreted as expressing the intensity of the nerve stimulation 
at a particular position, x, along the basilar membrane. The quantity, 
x, is expressed in terms of the percent of total nerve endings that the 
maximum stimulation passes over as a stimulating tone is changed 
from lowest audible frequency to a frequency f, corresponding to x. 
The product F(M) and dx represents the loudness contributed by a 
sound within the differential length dx and, consequently, the integra- 
tion represents the total nerve stimulation, or loudness, over the length 
of the basilar membrane. 

For continuous spectrum sounds, the masking, M,‡ at any frequency 
is related to the effective level, Z, of the masking sound at 
the same frequency. Thus, the loudness formula (1) can also be 
expressed as follows: 

$$N = \int Q(Z)\,dx. \tag{2}$$

* Figure 137, Ref. 27. 

† See Refs. 23 and 39 for more detailed discussion of the concepts considered 
in this and subsequent paragraphs. 

‡ $M = \beta - \beta_0$, where $\beta$ = the level of a single-frequency tone which is just 
audible in the presence of a specified noise and $\beta_0$ = the level of the same tone 
which is just audible when the noise is absent. (See Ref. 39.) 



Interpretation of this formula is the same as that for (1) in that it 
represents the integrated nerve stimulation over the length of the 
basilar membrane. However, the intensity of the stimulation at any 
frequency is determined from the effective level of the continuous 
spectrum sound, Z, defined as follows: 



$$Z = B - X \tag{3}$$



where 



Z = effective spectrum level (dB) 

B = pressure spectrum level (dBt) 

X = a function (dBt), empirically determined from listening tests, 
which reflects the fact that not all of the objectively measurable 
spectrum (B) is effective in producing loudness. 

The X function plotted on Fig. 4 may be considered to be the thresh- 
old of audibility for continuous spectrum sounds. Although this is 
not strictly correct, it lends physical interpretation to the function 
and, as such, facilitates understanding of the loudness computation 
technique.* 

The loudness, N, of a continuous spectrum sound can be computed 
by (i) dividing the audible frequency range into a number of successive 
frequency bands conveniently selected for computational purposes, 
(ii) determining from the effective level and the importance 
of each band the number of loudness units contributed by the band, 
and (iii) summing the loudness units over the audible range. This 
procedure, replacing the integration of (2), may be written as 

$$N = \sum n_x\,\Delta x \tag{4}$$

where 

N = total number of loudness units, 

$n_x$ = loudness, Q(Z) from equation (2), integrated over a unit length 
of the basilar membrane, and 

$\Delta x$ = the number of unit lengths along the basilar membrane 
included in the computation band.† 

The quantity, $n_x$, is a function of the effective level, Z, of that portion 
of the sound lying in the computation band. 

3.3.2 Speech Sounds — Definitions and Formulae 

Development of the speech loudness formula was based on the 
assumption that a formula of the same general type as (4) could be 
used, that is, the loudness of speech can be computed by dividing 
the speech spectrum into a number of successive bands selected for 



* Consider the case in which no masking occurs, i.e., M = 0. Assuming M 
= Z, then Z = B — X = 0, and X = B, the pressure spectrum level of a noise 
which produces zero masking, hence zero loudness. X can thus be defined as the 
threshold of audibility for continuous spectrum sounds, useful for the purpose 
indicated, but not strictly correct since it has been found that at low values of 
M (e.g., M = 0), the relationship M = Z is not valid. (See equation (10-5) of 
Ref. 27.) 

† In Ref. 23, the unit length was taken as 1 percent of the basilar membrane 
length. 

* Many of the steady-state definitions of Section 3.3.1 apply also for speech. 
However, since some of the functions differ, the subscript s is used where 
appropriate to denote those specifically applicable to the speech case. 
convenience of computation, determining the loudness contributed by 
each band, and summing the loudness over the audible range. This 
can be expressed as follows: 

$$N_s = \sum n_s\,\Delta S \tag{5}$$

where 

$n_s$ = total loudness produced by speech in a frequency band of unit 
importance 

$\Delta S$ = the number of bands of unit importance included in a 
computation band. 

The quantity, $n_s$, is a function of $Z_s$, the effective level of the received 
speech. Following the definition used in the formulation for continuous 
spectrum steady sounds, $Z_s$ is defined as follows: 

$$Z_s = B_s - X \tag{6}$$

where 

$B_s$ = sound pressure spectrum level of received speech (dBt)* 
and X is defined in equation (3). 

The bands of unit importance for equation (5) are so chosen that 
no matter where they are located on the basilar membrane, each 
contributes an equal amount to the total loudness when the effective 
level of the speech is independent of frequency, i.e., $Z_s = k$ (a 
constant) across the audible range.† When the speech band is divided into 
a number of bands suitable for computational purposes, each of these 
may contain several bands of unit importance. Thus, the loudness, $n_s$, 
contributed by a band of unit importance must be multiplied by the 
number of such bands, $\Delta S$, in the computation band. The value of 
$\Delta S$ used for any particular computation band depends on the width 
of that band (in Hz) and its location on the frequency scale. 
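
Because S is a cumulative function of frequency, the value of ΔS for a computation band is simply the difference between S at the band's upper and lower edges. The sketch below illustrates this under stated assumptions: the tabulated S values are hypothetical stand-ins (the actual function is that of Fig. 5), scaled so that the full band holds 50 bands of unit importance, and linear interpolation between points is an assumption of this example.

```python
import numpy as np

# HYPOTHETICAL cumulative importance curve, not the data of Fig. 5.
# S is expressed as the cumulative number of bands of unit importance.
S_FREQ_HZ = np.array([100.0, 500.0, 1000.0, 2000.0, 5000.0])
S_UNITS = np.array([0.0, 20.0, 32.5, 42.5, 50.0])


def cumulative_importance(f_hz):
    """Interpolate the (assumed) cumulative S function at frequency f_hz."""
    return np.interp(f_hz, S_FREQ_HZ, S_UNITS)


def delta_s(f_low_hz, f_high_hz):
    """Number of unit-importance bands lying between two frequencies."""
    return float(cumulative_importance(f_high_hz)
                 - cumulative_importance(f_low_hz))


# A computation band from 500 to 1000 Hz on the assumed curve.
print(delta_s(500.0, 1000.0))
```

With this form, bands of equal width in Hz generally receive different values of ΔS depending on where they fall on the frequency scale, which is the behavior the text describes.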

The summation of $\Delta S$ along the frequency scale represents the 
cumulative number of bands of unit importance. The rate of change 



* Speech varies rapidly in level with time. Accounting for this variability 
would have entailed development of complex formulae from the limited amount 
of applicable data available on speech. The approach followed was to develop 
the simplest possible method of computing speech loudness from the long-term 
average sound pressure spectrum of speech where long term is understood to 
comprise a time interval which is long compared to the average syllabic interval. 

† Selection of the bands is based on a flat effective spectrum for purposes of 
simplicity. Any other shape, i.e., $Z_s \neq k$, would require a different loudness 
versus $Z_s$ function for each band of unit importance. 
of S with frequency may, therefore, be interpreted as showing the 
relative importance to loudness of equal effective levels of speech in 
different frequency regions. This interpretation of $\Delta S$ and the 
interpretation of $n_s$ as the loudness of a band of unit importance, based 
on the similarity of the quantities $n_x$ and $\Delta x$ in equation (4) and $n_s$ 
and $\Delta S$ in equation (5), are probably not exact because of the radically 
different nature of speech and steady-state sounds. These interpretations 
should, therefore, be applied with caution, and regarded primarily 
as a point of view helpful in systematizing the basic data and 
in understanding the loudness formulations. 

3.3.3 Received Speech Spectrum 

For any given telephone connection, the sound pressure level of the 
received speech [$B_s$ of equation (6)] and, hence, the loudness of the 
speech depends on (i) the speech spectrum applied at the talking end 
of the connection and (ii) the loss of the connection as a function of 
frequency. These factors are taken into account in the speech loud- 
ness computation method by assuming a compromise speech spectrum 
and expressing the connection loss in orthotelephonic terms. 

3.3.3.1 Compromise Speech Spectrum. The compromise spectrum 
adopted is shown on Fig. 2 and is designated $B_{90}$ because the area under 
the curve integrates to a total sound pressure level of 90 dBt.* This 
spectrum was obtained by averaging and smoothing long average power 
measurements of continuous speech (including pauses between words) 
from 13 males and 12 females.† Although some differences in spectral 
content were found between individual voices, the shapes of the spectrum 
curves were sufficiently alike to justify averaging across all voices. 
Moreover, measured spectra from other studies exhibit similar 
shapes. 41-43 

Several experimenters have reported measurements of the long- 
term average sound pressure level of speech. Results of these experi- 
ments, referred to a point 2 inches in front of a talker's lips, are given 
in Table I. (Selection of the 2-inch reference point is discussed in the 
next section.) 

The values are in fair agreement excepting the last row. Benson and 
Hirsh considered the apparent discrepancies at some length, con- 
cluding that values reported by Dunn and White and by Rudmose, 



* This spectrum is the same as that used in computing articulation index. See 
Fig. 2 of Ref. 39. 

† Spectrum curves reported in Ref. 40 provided some of the data used in 
deriving this compromise spectrum. 
Table I — Long-Term Average Sound Pressure Level of Speech 
at 2 Inches from Talkers' Lips 



Source 



Sivian — Ref. 45 

Dunn and Farnsworth — Ref. 44 
Dunn and White— Ref. 40 
Rudmose, et al — Ref. 42 
Benson and Hirsh — Ref. 43 



Males 



No. 



Pressure 
(dBt) 



91.8 
90.4 
90.8* 
92.6* 

81.7* 



Females 



No. 



Pressure 
(dBt) 



90.0 

88.0* 

79.7* 



* Values reported were converted to apply at 2 inches by assuming an equivalent 
point source 0.6 cm behind the talker's lips. See Ref. 44. 



et al., "approximate more nearly monitoring levels (peaks of energy 
probably corresponding to vowel sounds) than to an actual average." 43 

There is thus some uncertainty regarding the sound pressure level 
at 2 inches from a real voice. For present purposes, the level is as- 
sumed to be 90 dBt, an approximate average for the majority of the 
data (specifically excluding the Benson-Hirsh data). As will become 
evident, the exact value assumed for the level at 2 inches does not 
significantly affect the speech loudness computational procedure as 
long as the spectrum shape (see Fig. 2) is not changed. 

3.3.3.2 Orthotelephonic Transmission. The concept of orthotelephonic 
transmission is based on relating telephone and face-to-face conver- 
sation. 16 Specifically, orthotelephonic transmission for present purposes 
implies reproduction by a telephone system of speech sounds which are 
indistinguishable from those received with an air transmission system. 
Conditions applicable for the latter case are negligible noise and acoustic 
reflections when listening monaurally at a distance of 1 meter from the lips 
of the talker, the listener facing the talker. An orthotelephonic system 
is thus one that provides the required orthotelephonic reproduction of 
the speech sounds. 

Application of the orthotelephonic concept to telephone connec- 
tions is detailed in Fig. 3. The definitions are given here as a matter 
of convenience, and will be considered where appropriate in later 
discussion. Typical orthotelephonic amplitude responses of overall 
telephone connections employing specific telephone set types are given 
in Ref. 16. Such responses vary widely depending on the telephone 
set and transducer types used. 

One point worth noting is that the orthotelephonic transmitting 
response definition of Fig. 3 is in terms of the sound pressure level at 
2 inches from a talker's lips, whereas strict conformance with the 
orthotelephonic definition would require expression in terms of the 
1-meter sound pressure level. This change was made for two reasons: 
(i) the 1-meter reference involves a long acoustic path which is 
ascribed to the transmitter; (ii) measurements at 2 inches are more 
reproducible since they are less likely to be affected by characteristics 
of the surrounding environment, e.g., reverberation. However, studies 
show that under free-field conditions the sound pressure spectra at 
2 inches (approximately 5 cm) and at greater distances have about 
the same shape, but differ in absolute level by about 25 dB for 
the 1-meter and 2-inch positions. 44 
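
The 25-dB figure is consistent with the equivalent point source mentioned in the footnote to Table I. As a rough check, the sketch below assumes simple inverse-distance (free-field) spreading from a point source 0.6 cm behind the lips; the spreading law and the function name are assumptions of this example, not a statement of how the cited measurements were converted.

```python
import math

SOURCE_OFFSET_CM = 0.6   # equivalent point source behind the lips (Table I footnote)
TWO_INCHES_CM = 2 * 2.54


def level_change_db(from_cm, to_cm, offset_cm=SOURCE_OFFSET_CM):
    """dB to add to a level measured at from_cm to refer it to to_cm.

    Assumes pressure falls off inversely with distance from the
    equivalent point source (free-field conditions).
    """
    return 20.0 * math.log10((from_cm + offset_cm) / (to_cm + offset_cm))


# Level difference between the 2-inch and 1-meter positions.
difference_db = level_change_db(100.0, TWO_INCHES_CM)
print(round(difference_db, 1))
```

Under these assumptions the difference comes out very close to the 25 dB quoted above.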

3.3.4 Derivation of $n_s$ and S Functions 

The speech loudness formula [equation (5)] involves the two 
quantities $n_s$ and S. As noted earlier, $n_s$ is a function of $Z_s$ and is the 
loudness due to that portion of speech contained in a frequency band 
of unit importance; S is a function of frequency which shows the 
relative importance to total loudness of equal effective levels of speech 
in different frequency regions. These functions, though somewhat 
interrelated, may be thought of as providing level weighting and 
frequency weighting respectively. 

Data from three tests were used in deriving these functions. The $n_s$ 
function was based on tests by Munson, the results of which show 
how loudness varies as a function of received speech level. Data 
obtained by Steinberg and by Van Wynen were used to derive the S 
function. These data show the manner in which loudness depends on 
frequency content of the received speech signal. Other data were 
available, but were not used because (i) there were gross inconsisten- 
cies between the data and data from other tests and/or (ii) the test 
system and conditions were not reported in sufficient detail to permit 
utilization of the test results. 

An important factor in deriving the $n_s$ and S functions is $Z_s$, the 
effective level of the received speech. [See equation (6).] Effective 
levels for the test results used in the derivation are shown on Fig. 9. 
The reference effective spectrum curve, consisting of the compromise 
speech spectrum, $B_{90}$, of Fig. 2 minus the threshold function, X, of 
Fig. 4, is also plotted on Fig. 9. This function, utilized in the speech 
loudness computation technique later, is assumed to represent the 
average case for a large number of talker-listener pairs conversing 
over a telephone system with (i) orthotelephonic response and (ii) 25 
dB of flat gain relative to a true orthotelephonic system. The function 
is given here to illustrate the difference between the effective spectrum 
level curve for a system providing orthotelephonic response and the 
effective spectrum level curves for the Steinberg and Van Wynen test 
systems. 

Fig. 9 — Effective spectrum level curves. 

Because the $n_s$ and S functions were interrelated, their derivations were 
carried out concurrently by means of a series of successive approximations. 
For simplicity, the two functions are treated separately in succeeding 
sections, and the derivation is described in greatly simplified 
terms. The actual derivation involved obtaining a first estimate of 
the S function, using the Steinberg and the Van Wynen data (including 
the respective loudness law exponents; see later discussion), for 
correcting the actual $Z_s$ to a constant $Z_s$. This S function was then 
applied to the Munson data in order to obtain a curve of total loudness 
versus speech sound pressure level for a 100-5000 Hz band, the 
band of interest in the speech loudness computation method. The 
ordinate and abscissa of this curve were then corrected to reflect 
respectively the loudness in a band of unit importance and the $Z_s$ 
in a band of unit importance. 

These first estimates of the S and $n_s$ functions were used to compute 
the loudness for a number of conditions. Computed results were compared 
to experimental results, and the functions modified as indicated 
by this comparison. This process was continued until computed values 
reflected adequate accuracy for engineering purposes. 

3.3.4.1 The S Function. The S function was derived from results of 
tests by Steinberg (see Section 3.5.1) and by Van Wynen* (see Section 
3.5.2). In these tests, speech received through high- and low-pass filters 
was balanced against undistorted speech. Test results important in 
the derivation are given in Fig. 10. The intersection of smooth curves 
drawn through the test results provides two items of information: 
(i) $f_{50}$, the frequency above and below which the loudness contributions 
are equal;† (ii) $LL_{50}$, the number of dB by which undistorted speech 
must be reduced for a 50-percent reduction in loudness. 

By assuming a logarithmic relationship between loudness and effective 
level (see Section 3.3.4.2), we can determine the percentage of 
loudness contribution as a function of frequency. Thus, 

$$Z_{s2} - Z_{s1} = k \log \frac{N_2}{N_1} \tag{7}$$

where $Z_{s1}$ and $Z_{s2}$ are effective levels before and after a flat change 
in undistorted speech level, $N_1$ and $N_2$ are corresponding loudness 
numerics, and k is a constant. For the Steinberg tests, we find, 
substituting -9.1 dB for $Z_{s2} - Z_{s1}$ and 1/2 for $N_2/N_1$ (50-percent 
loudness reduction), that 

$$k = \frac{Z_{s2} - Z_{s1}}{\log (N_2/N_1)} = \frac{-9.1}{\log (1/2)} = 30.2 \text{ dB}$$

while for the Van Wynen tests, k = 41.2 dB. 

We can express the relationship of equation (7) in the power function 
form frequently encountered in current literature 33 - 30 by letting P 
represent the stimulus magnitude and setting it equal to antilog (Z/20). 
Then 

$$N = P^n \tag{8}$$



* Steinberg reports results for sensation levels = 100 dB and 39 dB. 29 (Sensation 
level is the level above threshold. See Ref. 21.) Only the 100-dB results 
were used in the derivation. Results for the 39-dB level were near the knee 
of the loudness curve (see Fig. 8 for example) and their inclusion would have 
substantially complicated the derivation. 

† This assumes ideal filters. The filters used in the tests appear, for speech 
signals, to be sufficiently close to ideal to warrant making this assumption. 
Fig. 10 — Filter test results. 



where N and P are as defined above and n = 20/k. The values of n 
are then 0.662 and 0.486 for the Steinberg and Van Wynen tests 
respectively. 
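
The step from equation (7) to equation (8) can be made explicit. The following short derivation assumes only the definition P = antilog (Z/20) given above and logarithms to base 10; the subscripts 1 and 2 on P are introduced here for illustration.

```latex
\begin{align*}
Z_{s2} - Z_{s1} &= 20 \log \frac{P_2}{P_1} = k \log \frac{N_2}{N_1} \\
\log \frac{N_2}{N_1} &= \frac{20}{k} \log \frac{P_2}{P_1} \\
\frac{N_2}{N_1} &= \left( \frac{P_2}{P_1} \right)^{20/k},
  \qquad n = \frac{20}{k} \\
n_{\text{Steinberg}} &= \frac{20}{30.2} \approx 0.662,
  \qquad n_{\text{Van Wynen}} = \frac{20}{41.2} \approx 0.485
\end{align*}
```

The last value differs in the third decimal place from the 0.486 quoted above, presumably because k = 41.2 dB is itself a rounded figure.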

By applying equation (7) to the curves of Fig. 10, we can determine 
the loudness reduction corresponding to various values of loudness 
loss. For example, the Van Wynen tests show that for a high-pass 
filter with cutoff frequency = 550 Hz, $Z_{s2} - Z_{s1}$ = -10.3 dB. Using 
the equation, we find that the 550-Hz high-pass filter passes 56.2 
percent of the loudness, suppressing 43.8 percent. Proceeding in a 
similar manner for all of the data, the points shown on Fig. 11 were 
obtained. It is of interest to note that there appear to be no systematic 
differences between the Steinberg and Van Wynen test results even 
though the exponents referred to in the preceding paragraph are 
somewhat different. 
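
The arithmetic behind these figures follows directly from equation (7) and can be checked with a few lines of Python. The function names below are illustrative only; the numerical inputs are the values quoted in the text.

```python
import math


def loudness_constant_k(level_change_db, ratio):
    """Solve equation (7) for k, given a level change and a loudness ratio."""
    return level_change_db / math.log10(ratio)


def loudness_ratio(level_change_db, k_db):
    """Solve equation (7) for N2/N1, given a level change and k."""
    return 10.0 ** (level_change_db / k_db)


# Steinberg tests: a 9.1-dB reduction halves the loudness.
k_steinberg = loudness_constant_k(-9.1, 0.5)

# Van Wynen tests: 550-Hz high-pass filter, level change of -10.3 dB, k = 41.2 dB.
passed = loudness_ratio(-10.3, 41.2)

print(round(k_steinberg, 1), round(100 * passed, 1), round(100 * (1 - passed), 1))
```

The fraction of loudness passed by the filter is the ratio N2/N1, and the fraction suppressed is its complement.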

Because the effective spectrum level curves for the two tests were almost 
identical (see Fig. 9), the smooth curve of Fig. 11 was drawn to 
represent all of the data points. This curve shows the percent of total 
loudness contributed by frequencies below the frequency of the 
abscissa for the effective spectrum level used in these tests. 

The S function required, showing the cumulative contribution to 
loudness when the effective spectrum level is flat with frequency, is 
plotted on Fig. 5. The function was obtained from the curve of Fig. 
11 by successive approximations as outlined in the introductory 
paragraphs of Section 3.3.4. 

3.3.4.2 The $n_s$ Function. The $n_s$ function, relating loudness and effective 
speech level in a band of unit importance, was derived using test 
results obtained by Munson in 1935-1936. The test system and general 
methodology are described in Ref. 46. However, the test results have 
not previously been reported in the literature. 

In the Munson tests, undistorted speech was loudness balanced 
against a 1000-Hz test tone. Also, thresholds were determined for the 
1000-Hz and speech signals. Detailed results of the tests are given in 
Tables II and III. Average results are plotted on Fig. 12. 

The ordinate of Fig. 12 was converted to a loudness scale using the 
solid curve of Fig. 7. The resulting relation between loudness and 
speech sound pressure level is shown on Fig. 13. (The ordinate of Fig. 
13 is in terms of Loudness Units (LU) rather than sones. See footnote 
of Section 3.1.) 

Table III — Results of Munson (1935-1936) Threshold Tests 
[Column headings: Speech; 1000-Hz Tone] 

In terms of equation (8), the curve of Fig. 13 at higher sound 
pressure levels represents an exponent of 0.455. This is in close 
agreement with the Van Wynen test results with exponent = 0.486 (see 
previous section), with the "Derived" (from balancing speech against 
a 1-kHz tone) and "Direct" (from adjusting a speech signal to sound 
"half as loud" and "twice as loud" as a standard speech signal) results 
from Ref. 32 with approximate exponents of 0.50 and 0.46 respectively, and 
with the derived curve of Fig. 52, Ref. 31, with an exponent of about 
0.51. The exponent does not agree with that from the Steinberg tests, 
which was 0.662 (see previous section), with the exponent of about 
0.65 determined from the monaural versus binaural loudness matches 
of Ref. 32, with the exponent of 0.7 determined from the single 
phoneme magnitude estimation study of Ref. 33, and with the exponent 
of 0.6 determined from the limited magnitude estimation work 
reported in Ref. 47. 

Fig. 13 — Loudness ($N_s$) vs speech sound pressure level for Munson (1935-1936) 
tests. 

The relation of Fig. 13, applicable for wideband speech having the 
effective spectrum shown by the long-dashed curve of Fig. 9, was then 
modified to apply for narrow bands of speech (bands of unit importance) 
and a flat effective spectrum. To do this, effective spectra 
were plotted for several of the loudness balance tests, e.g., see Fig. 9. 
For each of these, the frequency scale was divided into various numbers 
of bands of equal importance as determined from Fig. 5. It was 
then assumed that $Z_s$ was constant across each band of unit importance, 
and was equal in magnitude to the $Z_s$ at the center frequency 
of the unit band. This resulted in several "staircase" approximations 
to the smooth curve, each exhibiting a granularity depending on the 
number of bands selected. Study of the various spectra indicated that 
using 50 bands of unit importance, referred to as 2-percent loudness 
bands, represented a reasonable compromise between the conflicting 
requirements of (i) accuracy, improved by increasing the number of 
bands and thus obtaining a closer match between the "staircase" ap- 
proximations and the smooth curves, and (ii) simplicity, obtained by 
reducing the number of bands. 

The required function of n_s versus Z_s, shown on Fig. 6, was
obtained from Fig. 13 by a series of successive approximations as
outlined in the introductory paragraphs of Section 3.3.4. Specific
steps involved included (i) adjusting the ordinate and abscissa scales
of Fig. 13 to represent a band of 100-5000 Hz (this required making
allowance for the fact that the effective level of the speech spectrum
used in the Munson tests was not flat with frequency) and (ii)
dividing the ordinate by 50 (to reflect contribution by each 2-percent
band) and by 2 (since the Munson test results and the solid curve of
Fig. 7 are both in terms of binaural listening while the case of usual
interest in telephony is monaural listening).

3.4 Computational Procedure 

Mechanics of the computation for any given telephone connection
consist of two parts, (i) the computation of the loudness, N_s, of the
received speech, and (ii) interpretation of the computed loudness
numbers. The former is carried out utilizing a computational form
based on Figs. 5 and 9.

The computational method to be described applies specifically to 
the monaural case, of primary interest in telephone problems. If 
speech is received binaurally, we assume that the "average" listener 
is symmetrical and compute the loudness separately for each ear, 
adding the resulting loudness numbers to obtain the total loudness. 
[See equation (5).] Thus, in cases where the speech signals delivered 
to the two ears are identical, the resulting loudness is twice that 
obtained with one ear alone. 

3.4.1 Computation of Speech Loudness, N_s

The procedure followed in computing speech loudness is most readily
described in terms of the computation form shown in Fig. 14. The
computation involves four steps:

(1) Calculation of the received speech signal effective level, Z_s, at
a number of selected frequencies;

(2) Determination of n_s, the loudness in a band of unit importance,
from Z_s;

(3) Determination of the loudness in a computation band, the
product of n_s and ΔS;

(4) Summation of the loudness numerics across the band of interest.

Column 1 of Fig. 14 lists the frequencies at which the computations
are made and columns 2 through 7 are used for determining Z_s. The
values entered in column 2, obtained from Fig. 9 (reference effective
spectrum), apply for an acoustic talking level of 90 dBt.

Columns 3, 4, and 6 are respectively: T_ot, the orthotelephonic
response of the transmitting element; R_ot, the orthotelephonic
response of the receiving element; and L, the response of the circuit
interconnecting the transmitting and receiving elements. (Refer to
Fig. 3 for appropriate orthotelephonic definitions.) In many cases of
interest in telephony, the same transmitting and receiving elements
are likely to be used in many computations. Column 5 of the form
provides for combining responses of these elements with B_90 - X.

The effective level of the received speech spectrum, Z_s, resulting
from combining entries of columns 5 and 6, is entered in column 7.
The effective level in each computation band is converted to a
loudness number for a band of unit importance using the curve of
Fig. 6. The loudness numbers, entered in column 8, apply for frequency
bands having 2-percent importance. The bands of Fig. 14 with
midfrequencies of 600 Hz and above have been selected to give
2-percent importance. Below this frequency, however, the 2-percent
bands become very narrow (see Fig. 5), and circuit measurements at the
required frequencies provide a fineness in the response curves not
required for telephone rating purposes. Thus, wider bands are used
and the greater importance of these bands is allowed for by means of
column 9, which shows the number of 2-percent bands in each
computation band. The loudness for a computation band is thus the
product of column 8 and column 9 entries, and is entered in column 10.
The total loudness, N_s, is obtained by summing column 10.

In some cases, the computation frequency is not the midfrequency
of the computation band according to Fig. 5. The midfrequencies
were rounded off as shown for purposes of convenience. For the
lowest computation band, this rounding off overemphasizes that band's
contribution to the loudness, but this generally affects the total
loudness number, N_s, only slightly.

Fig. 14 - Speech loudness computation form.
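The column arithmetic of the computation form can be sketched in a few
lines of code. The sketch below is illustrative only: the names
BandRow, speech_loudness, and toy_curve are hypothetical, the circuit
entry of column 6 is assumed to be a response in dB that adds to
column 5, and toy_curve is a made-up stand-in for the curve of Fig. 6,
whose actual values are not reproduced here.

```python
from typing import Callable, NamedTuple, Sequence


class BandRow(NamedTuple):
    """One row of a Fig. 14 style form; all levels in dB (hypothetical layout)."""
    freq_hz: float      # column 1: computation frequency
    b90_minus_x: float  # column 2: reference effective spectrum, B_90 - X
    t_ot: float         # column 3: transmitting response, T_ot
    r_ot: float         # column 4: receiving response, R_ot
    circuit: float      # column 6: response of the connecting circuit, L
    unit_bands: float   # column 9: number of 2-percent bands in this band


def speech_loudness(rows: Sequence[BandRow],
                    n_of_z: Callable[[float], float]) -> float:
    """Sum column 10 of the form to obtain the total loudness N_s."""
    total = 0.0
    for row in rows:
        col5 = row.b90_minus_x + row.t_ot + row.r_ot  # column 5
        z_s = col5 + row.circuit                      # column 7: Z_s
        n_s = n_of_z(z_s)                             # column 8: n_s from Z_s
        total += n_s * row.unit_bands                 # column 10
    return total


def toy_curve(z_s: float) -> float:
    """Placeholder for the Fig. 6 curve; NOT the published relation."""
    return max(z_s, 0.0) / 10.0


# Two made-up rows, used only to exercise the arithmetic.
example_rows = [
    BandRow(300.0, 70.0, -5.0, 5.0, -10.0, 4.0),
    BandRow(650.0, 65.0, 0.0, 0.0, -10.0, 1.0),
]
example_total = speech_loudness(example_rows, toy_curve)
print(example_total)
```

In practice n_of_z would be replaced by a tabulation of Fig. 6, and
the rows by the entries of an actual computation form.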

The computation bands and the computation frequencies of Fig. 14
were selected for convenience. These can be changed, e.g., to consider
1/3-octave bands, by first locating the band limits on Fig. 5, then
determining (i) the center frequency for, and (ii) the number of
bands of unit importance contained in, the selected computation band.
Appropriate values of B_90 - X can be obtained from Fig. 9.

2698 THE BELL SYSTEM TECHNICAL JOURNAL, OCTOBER 1971

The procedure described above applies for cases where the loudness
of a band of speech is determined by the effective level of the speech
within that band. Masking experiments have shown that when the
effective level of a continuous spectrum sound changes abruptly with
frequency, e.g., a sharp cutoff filter applied to thermal noise, there
may be masking in the suppressed frequency band due to energy in
the passed frequency band. Since masking has been identified with
loudness, it is reasonable to assume that the loudness of a
band-limited speech signal will depend not only on the passed region
but on the suppressed region as well. This effect, termed
spread-of-loudness, is considered in the computation by assuming that
Z_s does not decrease by more than 10 dB between adjacent 2-percent
loudness bands.* If Z_s (column 7 of Fig. 14) shows a more rapid
change, a new Z_s is entered which is exactly 10 dB below the Z_s of
the preceding or succeeding 2-percent band.
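The spread-of-loudness rule amounts to limiting how steeply Z_s may
fall off on either side of a passed region. A minimal sketch follows,
assuming the input is a list of Z_s values with one entry per
2-percent band in ascending frequency order; the function name and
the two-pass formulation are illustrative choices, not taken from the
computation form.

```python
from typing import List


def apply_spread_of_loudness(z_s: List[float],
                             max_step_db: float = 10.0) -> List[float]:
    """Limit the drop in Z_s between adjacent 2-percent bands to max_step_db."""
    out = list(z_s)
    # Upward in frequency: a band may not lie more than max_step_db
    # below the preceding band.
    for i in range(1, len(out)):
        out[i] = max(out[i], out[i - 1] - max_step_db)
    # Downward in frequency: a band may not lie more than max_step_db
    # below the succeeding band.
    for i in range(len(out) - 2, -1, -1):
        out[i] = max(out[i], out[i + 1] - max_step_db)
    return out


# A 40-dB notch two bands wide is filled in from both edges.
notched = apply_spread_of_loudness([60.0, 60.0, 20.0, 20.0, 60.0])
print(notched)
```

Levels that already change by 10 dB or less per band pass through
unchanged.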

3.4.2 Interpretation of Total Loudness, N_s

Loudness in loudness units (LU), although a ratio scale reflecting
observers' assessment of sound magnitude, has little physical
significance. It is thus desirable to express loudness in such a way
as to convey a physical interpretation. This can be done by expressing
the loudness of a sound in terms of the level of a reference sound
which has the same number of loudness units. Thus, speech loudness can
be expressed in terms of loudness level, i.e., the level of a 1000-Hz
tone having the same loudness. Loudness level is an appropriate way
to express speech loudness when comparing computed values with
results of tests which involved loudness balancing speech and a
1000-Hz tone. Speech loudness (N_s) can be computed using the form of
Fig. 14; then, with N_s = N_1000, the loudness level is found from
Fig. 7.

Often speech through a test circuit is balanced against speech
through some relatively distortionless reference circuit. In these
cases, the loudness of the speech transmitted over the test circuit
should probably be expressed in terms of the setting of the reference
circuit.

Thus, speech loudness can be expressed in terms of several different
reference sounds. A single reference would be desirable to enable
comparison of results from different tests. Loudness level would be
suitable for such purposes. However, considering the different nature
of speech and a 1000-Hz tone, a speech signal of specified
characteristics might be a more suitable reference, one that would be
more intuitively satisfying.

* The spread-of-loudness effect was determined from masking data obtained
by Fletcher and Munson (Figs. 11 and 12 of Ref. 23 or Figs. 141 and 142 of
Ref. 27) using narrow bands of noise. The masking data were plotted in terms
of number of 2-percent bands below and above the nominal bandlimits of the
noise. A straight line with a slope of 10 dB/band fitted the data with
reasonable accuracy, thus permitting use of a simple rule for taking
spread-of-loudness effects into account.

The reference selected for the speech loudness computation procedure
is that due to transmitting average speech (Fig. 2) having the
reference effective spectrum (Fig. 9) over a circuit that is flat on
an orthotelephonic basis (Fig. 3) except for sharp cutoffs at 250 and
4000 Hz. (These frequency limits, selected somewhat arbitrarily, were
wide enough to include existing and planned commercial telephone
circuits.) The loudness of speech from any telephone circuit can then
be specified by the level of the reference that gives an equivalent
N_s; the level, in turn, can be conveniently specified by its
integrated rms pressure in dBt. The level of the reference speech
spectrum, designated as L_s, is related to its loudness, N_s, as shown
on Fig. 8. This curve was derived using the form of Fig. 14,
successively changing entries per column 2 thereon, and computing N_s
for each such change. It should be noted that (i) L_s for the entries
of column 2 is 90 dBt for the band 100-5000 Hz, 89.4 dBt for the band
250-4000 Hz and (ii) spread-of-loudness was taken into account in the
computation of N_s.
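Reading L_s from a curve such as that of Fig. 8 is an inverse lookup
on a monotonic relation. The sketch below shows one way this might be
done from a tabulation. The points in PLACEHOLDER_CURVE are invented
solely to exercise the code and are not values from Fig. 8, and
linear interpolation between points is an assumption.

```python
from bisect import bisect_left
from typing import List, Tuple

# (L_s in dBt, N_s) pairs in ascending order. Placeholder values only;
# a real table would be read from the Fig. 8 curve.
PLACEHOLDER_CURVE: List[Tuple[float, float]] = [
    (60.0, 10.0),
    (70.0, 20.0),
    (80.0, 40.0),
]


def level_for_loudness(n_s: float, curve: List[Tuple[float, float]]) -> float:
    """Return the reference level L_s whose tabulated loudness equals n_s."""
    levels = [point[0] for point in curve]
    loudness = [point[1] for point in curve]
    if not loudness[0] <= n_s <= loudness[-1]:
        raise ValueError("n_s lies outside the tabulated range")
    i = bisect_left(loudness, n_s)
    if loudness[i] == n_s:
        return levels[i]
    # Linear interpolation between the two bracketing points.
    frac = (n_s - loudness[i - 1]) / (loudness[i] - loudness[i - 1])
    return levels[i - 1] + frac * (levels[i] - levels[i - 1])


example_level = level_for_loudness(30.0, PLACEHOLDER_CURVE)
print(example_level)
```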

3.5 Comparison of Computed and Observed Results 

Observed results from six different subjective tests were compared
to computed results obtained using the computation form of Fig. 14
and the frequency response characteristics of the subjective test
systems. (Observed results from two of these tests were used in
developing the computational method.) Observed results comprised
reference circuit attenuator settings or trunk settings obtained by
observers when they loudness-balanced reference circuits and test
circuits.

Computed reference circuit settings were determined by first
computing the loudness (N_s) for each test condition and its
corresponding reference condition (a reference circuit with its
attenuator or trunk setting at the value to which observers adjusted
to obtain loudness balance) using the test and reference circuit
frequency response characteristics.* The computed loudness values
(N_s) were then converted to levels of the reference spectrum (L_s).
By comparing the L_s of each test circuit with the L_s of the
corresponding reference circuit at observed balance, computed settings
were obtained.

* Frequency response characteristics used for the loudness computations
discussed in this paper are not reported.
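The last step described above, obtaining a computed setting from the
two L_s values, can be sketched as follows. The sign convention (a
test circuit computed to be louder than the reference at balance
calls for less reference attenuation) is inferred from the tabulated
values of Table VI rather than stated explicitly in the text, and the
function name is hypothetical.

```python
def computed_reference_setting(observed_setting_db: float,
                               ls_test_dbt: float,
                               ls_reference_dbt: float) -> float:
    """Shift the observed setting by the computed L_s difference at balance."""
    return observed_setting_db + (ls_reference_dbt - ls_test_dbt)


# Two rows from Table VI: 2700-Hz low-pass and 550-Hz high-pass filters.
lpf_2700 = computed_reference_setting(24.2, 81.4, 80.8)
hpf_550 = computed_reference_setting(30.8, 73.1, 74.0)
print(round(lpf_2700, 1), round(hpf_550, 1))
```

The two results correspond to the computed attenuator settings listed
for those networks in Table VI.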

Computed and observed results are discussed and tabulated in
Sections 3.5.1 through 3.5.6 and are plotted on Fig. 15. (For the
latter, the L_s for the reference circuit at balance is considered to
be the observed value, the L_s for the test circuit is the computed
value.) Comparisons of computed and observed results are summarized in
Table IV in terms of error distributions. These, together with Fig. 15
and tabulated values of ensuing sections, indicate that the
computational method provides a high degree of accuracy.

Fig. 15 - Comparison of reference (observed) and test (computed)
speech loudness.

Table IV - Summary of Comparison of Computed and Observed Results

                                              Computed Minus Observed
                                 Detailed     -----------------------
Test Designation                 Results In   Average-dB     σ-dB
------------------------------   ----------   ----------     ----
Steinberg tests (1924)
  Sensation level = 100 dB       3.5.1           1.4          1.1
  Sensation level = 39 dB        3.5.1           4.0          2.2
Van Wynen linear
  system tests (1940)            3.5.2          -0.8          2.1
Van Wynen telephone set
  tests (1940)                   3.5.3           0.3          1.1
Van Wynen tie line
  tests (1940)                   3.5.4          -0.4          0.9
Fraser tests (1946)              3.5.5           0.2          0.9
CCIF tests (1939)                3.5.6           0.8          1.1


3.5.1 Steinberg Linear System Tests (1924) 

In 1924, J. C. Steinberg conducted loudness balance tests with a
system which utilized linear transducers interconnected by either of
two channels, a test channel into which various filters were inserted
and a reference channel which provided essentially distortionless
transmission (Refs. 28, 29). Observers adjusted an attenuator in the
reference channel to obtain loudness balance for each of a number of
the test networks.

Computed and observed loudness losses are given in Table V. For 
purposes of the table, loudness loss is equal to the amount of flat loss 
by which the reference circuit was varied from its reference setting 
(for the particular sensation level) so that speech through the 
reference channel had the same loudness as speech through the test 
channel. Observed values given for the 100-dB sensation level were 
used in deriving the S function. (See Section 3.3.4.1.) 

For the reference channel, L_s with the 100-dB sensation level
reference setting was 107.1 dBt; with the 39-dB sensation level
reference setting, L_s = 42.6 dBt. Observed and computed loudness
losses per the table may be converted to L_s values for the reference
channel by subtracting entries in the "observed" column from the
above reference values; L_s for the test channel may be obtained by
subtracting entries in the "computed" column, for the appropriate
test network, from the reference values given above. Thus, the
reference channel L_s = 102.1 dBt and the test channel L_s = 100.1 dBt
for a 500-Hz high-pass filter and a 100-dB sensation level.
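The conversion just described is a pair of subtractions, and the
500-Hz high-pass example can be checked directly. In the sketch
below the two reference levels are those quoted in the text; the
dictionary and function names are illustrative.

```python
from typing import Tuple

# Reference-channel L_s (dBt) at the reference setting, keyed by
# sensation level in dB, as quoted in the text.
REFERENCE_LS_DBT = {100: 107.1, 39: 42.6}


def channel_levels(sensation_level_db: int,
                   observed_loss_db: float,
                   computed_loss_db: float) -> Tuple[float, float]:
    """Return (reference-channel L_s, test-channel L_s) in dBt."""
    reference_value = REFERENCE_LS_DBT[sensation_level_db]
    return (reference_value - observed_loss_db,
            reference_value - computed_loss_db)


# 500-Hz high-pass filter at the 100-dB sensation level (Table V):
# observed loss 5.0 dB, computed loss 7.0 dB.
ls_reference, ls_test = channel_levels(100, 5.0, 7.0)
print(round(ls_reference, 1), round(ls_test, 1))
```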

Agreement between computed and observed values is reasonably
good for the most part. In general, the agreement becomes poorer with
increasing distortion and is poorer at the lower sensation levels.
Neither of these effects is particularly restrictive from a telephone
engineering standpoint since they represent extremes which a
well-engineered telephone plant will seldom approach.

Table V - Computed and Observed Loudness Loss Values for the
Steinberg Linear System Tests (1924)

                Sensation Level = 100 dB          Sensation Level = 39 dB
                Loudness Loss (dB)                Loudness Loss (dB)
                A          B          C = B - A   D          E          F = E - D
Test Network*   Observed   Computed   (dB)        Observed   Computed   (dB)
-------------   --------   --------   ---------   --------   --------   ---------
125 Hz HPF         0.6        0.2       -0.4         0          0.2        0.2
250                2.0        2.7        0.7         0.4        1.0        0.6
375                3.6        5.0        1.4        -0.2        1.5        1.7
500                5.0        7.0        2.0         1.0        4.8        3.8
625                7.6        9.0        1.4         3.0        5.8        2.8
750                8.5       10.8        2.3         0.6        7.4        6.8
1000              10.5       13.8        3.3         3.2        9.8        6.6
1250              13.2       16.2        3.0         4.5       11.3        6.8
1500              16.6       18.5        1.9         5.3       13.0        7.7
                                  Avg. = 1.7                        Avg. = 4.1
                                     σ = 1.1                           σ = 2.8

750 Hz LPF         9.9       12.6        2.7         9.7       14.4        4.7
1000               7.4        9.4        2.0         7.3       11.6        4.3
1250               6.4        8.0        1.6         5.3       10.0        4.7
1500               4.4        6.5        2.1         3.7        8.6        4.9
2000               3.0        4.3        1.3         0.8        6.6        5.8
2500               2.4        2.5        0.1         1.4        4.3        2.9
3000               1.9        1.5       -0.4         0.2        3.0        2.8
4000               0.7        0.4       -0.3        -0.5        0.4        0.9
                                  Avg. = 1.1                        Avg. = 3.9
                                     σ = 1.1                           σ = 1.5

* HPF = high-pass filter; LPF = low-pass filter. Entries designate nominal
cutoff frequencies of the filters. See Ref. 48 for description of the filters
which were probably used in the Steinberg tests.

3.5.2 Van Wynen Linear System Tests (1940)

In 1940, K. G. Van Wynen conducted loudness balance tests using
a system equipped with linear transducers. (These tests were not
reported in the literature.) The transducers were interconnected via
one of two channels, the test channel into which various test networks
were introduced and a reference channel which was essentially
distortionless. Switching between channels was controlled by the
observers who, for each network tested, adjusted an attenuator in the
reference channel to produce equal loudness. Twelve talker-listener
pairs participated in the tests.

Computed and observed attenuator settings are given in Table VI.
The computed setting was obtained by noting the difference in L_s
for the test circuit and the reference circuit at balance, and
modifying the observed setting by this amount.*

The agreement between observed and computed reference circuit
settings is quite good, the error being less than 2 dB in ten of the
fifteen cases. Those cases showing the larger errors represent severe
distortion conditions which would be approached rarely, if ever, in
a well-engineered telephone plant.

* The high- and low-pass filter results, expressed in different terms, were
used in deriving the speech loudness computational method. See Section
3.3.4.1 and Fig. 10.

Table VI - Computed and Observed Results for the Van Wynen
Linear System Tests (1940)

                   Observed                                Computed
                   Reference     L_s - dBt                 Reference     Computed
                   Circuit       -----------------------   Circuit       Minus
                   Attenuator    Test       Reference      Attenuator    Observed
Distorting         Setting       Circuit    Circuit (at    Setting       (dB)
Network*           (dB)                     Balance)       (dB)
----------------   ----------    -------    -----------    ----------    --------
2700 Hz LPF           24.2         81.4        80.8           23.6         -0.6
1750 Hz LPF           28.3         78.3        76.6           26.6         -1.7
1000 Hz LPF           31.3         74.1        73.4           30.6         -0.7
250 Hz HPF            25.6         79.1        79.2           25.7          0.1
550 Hz HPF            30.8         73.1        74.0           31.7          0.9
1000 Hz HPF           36.4         67.6        68.1           36.9          0.5
250-2700 Hz BPF       29.0         75.0        76.0           30.0          1.0
550-1750 Hz BPF       38.9         62.5        65.6           42.0          3.1
1000 Hz Peak
  15 dB               26.8         80.2        78.1           24.7         -2.1
  30 dB               24.2         81.9        80.8           23.1         -1.1
  56 dB               34.5         70.3        70.1           34.3         -0.2
700 Hz Peak
  30 dB               25.8         81.4        79.0           23.4         -2.4
2000 Hz Peak
  30 dB               28.7         82.8        76.2           22.1         -6.6
Falling Loss          25.6         79.2        79.2           25.6          0
Rising Loss           26.4         80.6        78.5           24.3         -2.1
                                                                   Avg. = -0.8
                                                                      σ = 2.1

* LPF = low-pass filter, HPF = high-pass filter, BPF = bandpass filter;
numbers denote nominal filter cutoff frequencies. "Peak" denotes a resonant
circuit defined by the resonant frequency (minimum loss of circuit occurred
at resonance) and a dB value indicative of the Q, e.g., 56 dB indicates a
high Q, 15 dB indicates a low Q. "Falling Loss" designates a network whose
loss decreased monotonically with increasing frequency while for the "Rising
Loss" network, the loss increased monotonically with frequency.
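The error summary for the Van Wynen linear system tests can be
recomputed from the tabulated L_s values of Table VI. The sketch
below is a check under stated assumptions: a few digits in the rows
were restored from a garbled copy of the table by requiring each row
to be arithmetically consistent, and the use of a population (rather
than sample) standard deviation is a guess that happens to reproduce
the tabulated σ.

```python
from statistics import mean, pstdev

# (network, L_s of test circuit, L_s of reference circuit at balance), in dBt.
TABLE_VI_LEVELS = [
    ("2700 Hz LPF", 81.4, 80.8),
    ("1750 Hz LPF", 78.3, 76.6),
    ("1000 Hz LPF", 74.1, 73.4),
    ("250 Hz HPF", 79.1, 79.2),
    ("550 Hz HPF", 73.1, 74.0),
    ("1000 Hz HPF", 67.6, 68.1),
    ("250-2700 Hz BPF", 75.0, 76.0),
    ("550-1750 Hz BPF", 62.5, 65.6),
    ("1000 Hz Peak, 15 dB", 80.2, 78.1),
    ("1000 Hz Peak, 30 dB", 81.9, 80.8),
    ("1000 Hz Peak, 56 dB", 70.3, 70.1),
    ("700 Hz Peak, 30 dB", 81.4, 79.0),
    ("2000 Hz Peak, 30 dB", 82.8, 76.2),
    ("Falling Loss", 79.2, 79.2),
    ("Rising Loss", 80.6, 78.5),
]

# Computed minus observed setting equals reference L_s minus test L_s.
errors_db = [round(ref - test, 1) for _, test, ref in TABLE_VI_LEVELS]

average_error = mean(errors_db)
sigma_error = pstdev(errors_db)  # population form; an assumption
within_2_db = sum(1 for e in errors_db if abs(e) < 2.0)

print(round(average_error, 1), round(sigma_error, 1), within_2_db)
```

The count of errors below 2 dB gives the "ten of the fifteen cases"
cited in the text.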

3.5.3 Van Wynen Telephone Set Tests (1940)

In 1940, K. G. Van Wynen conducted tests in which different types 
of telephone sets were compared on a loudness basis. (These tests 
were not reported in the literature.) The laboratory system used con- 
sisted essentially of two separate acoustic-to-acoustic channels. One 
of these was equipped with reference transmitting and receiving 
telephone sets interconnected by a test network which included an 
adjustable attenuator. (The telephone sets were of the type described 
in Ref. 16.) The other channel used each of two different types of 
telephone sets, designated Test Telephone Sets A and B, with fixed 
connecting circuitry. (Test Telephone Sets A were of the type de- 
scribed in Ref. 15; Test Telephone Sets B were of a design similar to 
the telephone sets used with the Working Reference System described 
in Section 2.2 of this paper.) These sets differed from each other and 
from the reference telephone set in that they had transducers exhibit- 
ing different frequency response characteristics. 

Results of the tests are expressed in terms of trunk loss, variable
and controlled by the observer, required in the reference channel to
deliver speech equal in loudness to that from the test channel with
a fixed trunk loss. (Twelve talker-listener pairs participated in the
tests.) Computed and observed trunk losses shown in Table VII are
in close agreement. The computed trunk loss was obtained by noting
the difference in L_s for the test channel and the reference channel,
and modifying the observed trunk loss by this difference.

3.5.4 Van Wynen Tie Line Tests (1940) 

In 1940, K. G. Van Wynen conducted loudness balance tests with
laboratory systems which simulated selected circuit conditions
characteristic of telephone connections between Bell Telephone
Laboratories and the American Telephone and Telegraph Company. These
connections involved circuits which were called "tie lines," and thus
the designation "tie line tests."

Two test systems, designated "overall" and "sidetone," were used.*
Each of these permitted comparison between test networks and either
of two reference networks. (The transmitter and receiver used in these
tests were of a design described in Ref. 17.) The observers' task was
to adjust an attenuator in the reference channel so that speech via
this channel and the test channel were equally loud. Twelve
talker-listener pairs participated in the tests.

Results of the tests expressed in terms of reference circuit settings
are shown in Table VIII. Corresponding values of L_s (dBt) are given
as parenthetical entries; those in the "Observed" columns apply for
the reference channel while those in the "Computed" columns apply
for the test channel. Computed settings were obtained by noting the
difference between L_s for the test channel and L_s for the reference
channel, and modifying the observed circuit setting by this difference.
Computed and observed values are in close agreement.

3.5.5 Fraser Tests (1946) 

In 1946, J. M. Fraser conducted tests in which two different
acoustic-to-acoustic channels were compared on a loudness basis.
(These tests were not reported in the literature.) The test channel
comprised test telephone sets of the type which were the Bell System
Standard at that time (Refs. 10, 17), and connecting circuitry which
simulated selected telephone connections. These simulations,
arbitrarily designated in column 1 of Table IX, exhibited different
amounts of increasing loss with increasing frequency. The reference
channel was the Master Reference System discussed in Section 2.1 of
this paper.

Results of the tests are expressed in terms of the Master Reference 
System trunk (600-ohm attenuator) setting, variable and controlled 
by the observer, required to achieve loudness balance between the 
test channel and the reference channel. Twelve talker-listener pairs 
participated in the tests. Computed and observed reference trunk 
settings shown in Table IX are in close agreement. 

Table IX - Computed and Observed Loudness Levels for the
Test and Reference Channels of the Fraser Tests (1946)

          Observed                                Computed
          Reference    L_s - dBt                  Reference    Computed
          Trunk        -----------------------    Trunk        Minus
Test      Setting      Test         Reference     Setting      Observed
Network   (dB)         System       System        (dB)         (dB)
-------   ---------    ------       ---------     ---------    --------
A           33.3        67.5          69.2          35.0          1.7
B           39.9        63.3          62.8          39.4         -0.5
C           41.5        60.8          61.2          41.9          0.4
D           43.0        58.8          58.9          43.1          0.1
E           40.3        61.6          61.9          40.6          0.3
F           43.8        59.7          58.4          42.5         -1.3
G           40.6        60.9          61.6          41.3          0.7
                                                           Avg. = 0.2
                                                              σ = 0.9

* These designations reflect the frequency response characteristics simulated
and not testing conditions. Specifically, those with the "sidetone" system were
on a listening only basis, and did not involve a talker listening to himself via
the system. Sidetone is discussed in Refs. 33, 49, and 50.

3.5.6 CCIF Tests (1939)

In 1939, the CCIF† conducted loudness balance tests on selected
simulated telephone connections sent them by the American Telephone
and Telegraph Company. (These tests were not reported in the
literature.) These simulations were a subset of those discussed in
Section 3.5.4 of this paper.

† See Section 2.1 of this paper.

The CCIF tests involved two steps. First, the simulated connection
designated "DL Reference" with 20 dB of added loss and the SFERT*
were loudness balanced. Twenty talker-listener pairs, formed from a
trained test crew of five persons, each provided seven balances. Results
comprised settings (in dB) of the SFERT trunk (600-ohm attenuator)
required to achieve balance; the average value was 30.9 dB.†

The L_s for the DL Reference (with 20 dB added loss) was computed
using frequency response characteristics for the DL Reference of
Section 3.5.4 while the L_s for the SFERT (with an attenuator
setting of 30.9 dB) was computed from response characteristics for
the reference system of Section 3.5.5. The respective values were 74
dBt and 71.6 dBt. That is, the CCIF loudness balance tests showed
these circuits to be equal while the speech loudness computational
method indicates that they differ by 2.4 dB. Probably this difference
is due to the system response characteristics (primarily transducers)
used in the computations, and not to the computational method itself
being in error. In the computations, corrections reflecting differences
in specific transducers of the same type were applied where known,
but these were not known in all cases, and average responses were
necessarily used in these cases. This argument is supported to some
extent by the internal consistency for the tests of Section 3.5.4, the
tests of Section 3.5.5, and the CCIF tests.

* See Section 2.1 of this paper.

† As noted in Section 2.1, the SFERT line attenuator setting (in dB)
required for loudness balance with a given system under test is, by
definition, the Reference Equivalent (in dB) of that system. Thus, the
Reference Equivalent of the 20 DL Reference connection is 30.9 dB.

The DL Reference was then used as a secondary reference against 
which ten simulated telephone connections were loudness balanced. 
Computed and observed results given in Table X are in close agree- 
ment. 

3.5.7 Comparison of Computed Loudness Losses and Reference Equivalents

Reference Equivalents are, by definition, loudness ratings of tele- 
phone connections obtained using the SFERT (and, by inference, the 
MRS and NOSFER). The speech loudness computational method 
provides loudness ratings in orthotelephonic terms. Since both are 
based on speech loudness, information from the preceding sections 
can be used to derive a correction factor which would permit trans- 
lating ratings between the two frames of reference. (Such translation 
depends on bandwidth characteristics of telephone connections and 
therefore will depend on, among other things, the particular trans- 
ducer types involved.) Tables XI and XII summarize the appropriate 
information. 

The average conversion factor from Table XI is 12.7 dB while that 
from Table XII is 15.0 dB, a difference of 2.3 dB. Some possible 
reasons for this difference were discussed in Section 3.5.6. In addition, 
the observer groups differed for the Fraser and CCIF tests, and there 
were probably some minor differences in test procedure. 
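The two average conversion factors can be recomputed from the RE and
L_s columns of Tables XI and XII, taking loudness loss as 89.4 dBt
minus L_s as stated in the table footnotes. The sketch below is only
an arithmetic check; the variable and function names are
illustrative. Recomputing system C of the Fraser tests gives 12.9 dB
rather than the tabulated 12.7 dB, which does not change the rounded
average.

```python
from statistics import mean
from typing import List, Tuple

REFERENCE_LS_DBT = 89.4  # L_s of the reference spectrum, 250-4000 Hz band

# (Reference Equivalent in dB, L_s in dBt)
FRASER: List[Tuple[float, float]] = [
    (33.3, 67.5), (39.9, 63.3), (41.5, 60.8), (43.0, 58.8),
    (40.3, 61.6), (43.8, 59.7), (40.6, 60.9),
]
CCIF: List[Tuple[float, float]] = [
    (30.9, 74.0), (34.4, 68.3), (32.6, 71.3), (31.7, 73.2),
    (33.2, 71.1), (33.1, 70.7), (24.4, 82.8), (34.6, 69.8),
    (42.7, 61.5), (29.9, 75.2), (33.8, 69.5),
]


def conversion_factor(re_db: float, ls_dbt: float) -> float:
    """Reference Equivalent minus loudness loss (LL = 89.4 - L_s)."""
    loudness_loss_db = REFERENCE_LS_DBT - ls_dbt
    return re_db - loudness_loss_db


fraser_avg = mean(conversion_factor(re, ls) for re, ls in FRASER)
ccif_avg = mean(conversion_factor(re, ls) for re, ls in CCIF)

print(round(fraser_avg, 1), round(ccif_avg, 1),
      round(ccif_avg - fraser_avg, 1))
```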

Table XI - Relation Between Reference Equivalents (RE) and
Loudness Loss (LL) from the Fraser Tests (Section 3.5.5)

Test       RE*       L_s*      LL†       RE Minus
System     (dB)      (dBt)     (dB)      LL (dB)
------     ----      -----     ----      --------
A          33.3      67.5      21.9        11.4
B          39.9      63.3      26.1        13.8
C          41.5      60.8      28.6        12.7
D          43.0      58.8      30.6        12.4
E          40.3      61.6      27.8        12.5
F          43.8      59.7      29.7        14.1
G          40.6      60.9      28.5        12.1
                                    Avg. = 12.7
                                       σ = 0.9

* Columns 2 and 3 of Table IX.

† 89.4 minus entries of column 3. See last paragraph of Section 3.4.2.

Table XII - Relation Between Reference Equivalents (RE) and
Loudness Loss (LL) from CCIF Tests (Section 3.5.6)

Test            RE*       L_s†      LL‡       RE Minus
System          (dB)      (dBt)     (dB)      LL (dB)
-----------     ----      -----     ----      --------
20-DL REF       30.9      74        15.4        15.5
20-1900 LPF     34.4      68.3      21.1        13.3
20-2400 LPF     32.6      71.3      18.1        14.5
20-3000 LPF     31.7      73.2      16.2        15.5
20-RL2          33.2      71.1      18.3        14.9
20-RL3          33.1      70.7      18.7        14.4
REF DL          24.4      82.8       6.6        17.8
-10 DL          34.6      69.8      19.6        15.0
-20 DL          42.7      61.5      27.9        14.8
R900            29.9      75.2      14.2        15.7
R3200-A         33.8      69.5      19.9        13.9
                                         Avg. = 15.0
                                            σ = 1.1

* Derived from column 3 entries of Table X and relation that DL Reference with
20-dB trunk setting had a reference equivalent of 30.9 dB.

† From column 4 of Table X and text of Section 3.5.6.

‡ 89.4 minus entries of column 3. See last paragraph of Section 3.4.2.

IV. LABORATORY MEASURING SYSTEM

The speech loudness computational procedure could be realized as
a laboratory measuring system in any one of several forms. The
"ideal" system would implement exactly the various functions of the
computational procedure. The system sound source would energize
the transmit end of a telephone connection with an acoustic signal
having the amplitude characteristic of Fig. 2 and a time
rate-of-change based on Fig. 5. The signal received at the other end
of the connection would be applied to a measuring subsystem containing
a network with a response according to Fig. 4. System operation so far
corresponds to computing the effective spectrum, Z_s. (See column 7
of the computation form on Fig. 14.)

The remainder of the measuring subsystem would consist of a device
to reflect the spread-of-loudness effect, a device having the law of
Fig. 6, an integrating circuit, and, finally, a display mechanism
reflecting the law of Fig. 8.

The "ideal" system could be modified, without changing its essential
attributes, by combining the "X" network response with that of
the signal source. The source output would then have an amplitude
characteristic as shown by the dash-dot curve of Fig. 9. (See column
2 of the computation form on Fig. 14.)

Realization of the system design outlined above is limited by a 
number of practical considerations. Exact implementation would re- 
quire using an artificial mouth and an artificial ear which duplicated 
their human counterparts within the context of definitions given in 
Fig. 3. Such are not available. However, simpler devices are available 
which are probably adequate for purposes of many telephone engi- 
neering problems. 

Secondly, the source signal does not incorporate the dynamic 
properties of speech. This is probably not important as long as the 
telephone connection under test is composed of linear elements. How- 
ever, measurements on connections incorporating nonlinear elements, 
particularly carbon transmitters, may be in error. Carbon transmitters 
in real use are activated by real speech; testing such with an applied 
signal of the type referred to above might well result in different 
device operating characteristics. This matter is under study. 

In this paper, we consider a system which is somewhat simpler than 
the "ideal" and which probably reflects adequate accuracy for tele- 
phone rating purposes. This system, derived from the computational 
procedure, is designated the EARS (for Electro-Acoustic Rating
System), and is dealt with in ensuing sections.* A graphical computation
procedure is also described. Finally, a comparison is made
between ratings based on the EARS and ratings obtained from sub- 
jective tests. 

The EARS utilizes an artificial voice and a 6-cm³ coupler while the
computational procedure utilizes the concept of orthotelephonic transmission.
Response definitions appropriate to artificial mouth and 6-cm³
coupler measurements are shown on Fig. 16 in terms similar to the
orthotelephonic response definitions of Fig. 3.

Typical artificial voice/6-cm³ coupler amplitude responses of connections
employing specific telephone set types are given in Ref. 18.
As with orthotelephonic responses, artificial voice/6-cm³ coupler responses
vary widely depending on telephone set and transducer types
used. Conversion curves for translating between orthotelephonic and
artificial voice/6-cm³ coupler responses are not as variable, but are still
highly dependent on telephone set and transducer type.

* Reference 51 describes derivation of a measuring system which is similar 
to the EARS. 

[Fig. 16. Telephone circuit responses in artificial voice/6-cm³ coupler terms.
Legible labels from the figure: "gauged position relative to the artificial
mouth lip ring"; P_2a = pressure developed by telephone receiver in a 6-cm³
coupler; V_2 = voltage across the receiver. General note on the figure: see
text for discussion of artificial mouth, including specification of lip ring
pressure.]


4.1 Derivation of EARS 

We begin by considering the computation form of Fig. 14. For the
reference effective spectrum (column 2), we compute the cumulative
contribution to loudness as a function of computation band, i.e., we
assume an orthotelephonically flat transmission system with 0 dB
of gain for the sum of columns 3, 4, and 6, and compute the entries
for column 10. Cumulating these and dividing each cumulative entry
by the total loudness (N_s) results in a relation between cumulative
percentage contribution to loudness and frequency. This relation is
shown by the small circles plotted on Fig. 17; the ordinate is the cumulative
percentage contribution and the abscissa is the upper frequency limit of the
computation band, obtained from Fig. 5. The relation of Fig. 17 differs from that
of Fig. 5 because the latter is for a flat effective spectrum (Z_s = a
constant) whereas the former is for the effective spectrum of speech
(Z_s = B_wt - X) shown by the dot-dash curve of Fig. 9.

The points plotted on Fig. 17 can be reasonably approximated by 
a straight line drawn from 100 Hz to 5000 Hz on the logarithmic 
frequency scale. Thus, 2-percent loudness bands, or any bands of 
equal loudness derived from this curve, have a logarithmic relation 
to frequency and may be found by laying off equal lengths along the 
logarithmic frequency scale. 
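
As a rough illustration of laying off equal lengths along the logarithmic frequency scale, the sketch below divides the 100-5000 Hz span into bands of equal width in log frequency. The function name and the choice of exactly 50 bands (2 percent of the loudness each) are assumptions made for illustration, and the straight-line approximation of Fig. 17 is taken at face value.

```python
def equal_loudness_band_edges(f_low=100.0, f_high=5000.0, n_bands=50):
    """Return n_bands + 1 band edges in Hz, equally spaced in log frequency."""
    # A constant frequency ratio between adjacent edges gives equal lengths
    # on a logarithmic frequency axis.
    step_ratio = (f_high / f_low) ** (1.0 / n_bands)
    return [f_low * step_ratio ** k for k in range(n_bands + 1)]


edges = equal_loudness_band_edges()
# Under the straight-line approximation, half of the loudness lies below
# the middle edge, which is the geometric mean of the band limits.
midpoint = edges[25]
print(round(midpoint, 1))
```

Under this approximation the half-loudness point falls at the geometric mean of the two band limits, roughly 707 Hz.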

In terms of a measuring system, the above suggests using a source
signal which has a flat amplitude versus frequency characteristic but
which sweeps across the band at a logarithmic rate. Such a signal
would cover equal loudness bands of the effective spectrum (B_wt - X)
in equal time divisions, corresponding to equal length divisions along
the abscissa of Fig. 17. Integration of the sweep on a time basis
would result in a value proportional to the total loudness of the effective
spectrum since the signal amplitude is constant with frequency.

[Fig. 17. Cumulative contribution to loudness of speech for the reference
effective spectrum (see Figs. 9 and 14). Abscissa: frequency in Hz,
logarithmic scale.]

In essence, the operation described above translates the amplitude
weighting of Z_s = B_wt - X to frequency weighting. The computation
form of Fig. 14 could be modified to reflect this by changing columns
1 and 9 to conform with the straight line of Fig. 17 and entering a
constant value of Z_s in column 2. The value of Z_s can be determined
by dividing the computed N_s for B_wt - X by the number of computation
bands (e.g., 50) to find n_s (the loudness per band of unit importance),
then entering the curve of Fig. 6 at that value of n_s.

The source signal described above, in combination with the acoustic- 
to-acoustic response of a telephone connection, provides the effective 
spectrum of the received speech for that connection. The next step, 
then, is to apply the loudness scale of Fig. 6 in order to convert to 
loudness units. 

The loudness scale of Fig. 6 is linear on logarithmic coordinates,
i.e., Z_s = k log n_s, above the knee of the curve, and the number of
loudness units increases tenfold for a 44-dB change in effective level,
or doubles for a 13.2-dB change in effective level. (This is reasonably
checked by subjective loudness tests with filters as plotted on Fig. 10.
In the Steinberg tests, a 9.1-dB reduction in effective level corresponded
to a 50-percent reduction in loudness, while in the Van Wynen tests the
level reduction required to produce half loudness was 12.4 dB.) This
relation is assumed to hold over the entire range of effective levels.
The effect of this assumption on accuracy will be considered in a
later section.
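
The straight-line loudness law quoted above (tenfold per 44 dB) can be checked numerically. The following is a minimal sketch; the function names and the constant name are hypothetical, and the law is assumed to hold at all levels, as the text itself assumes.

```python
import math

DB_PER_TENFOLD = 44.0  # effective-level change giving a tenfold change in loudness units


def loudness_ratio(delta_level_db):
    """Ratio of loudness units for a given change in effective level (dB)."""
    return 10.0 ** (delta_level_db / DB_PER_TENFOLD)


def level_change_for_ratio(ratio):
    """Change in effective level (dB) needed for a given loudness ratio."""
    return DB_PER_TENFOLD * math.log10(ratio)


doubling_db = level_change_for_ratio(2.0)
print(round(doubling_db, 1))
```

The doubling figure that comes out agrees with the 13.2 dB quoted in the text to within rounding.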

At the receiving end of a telephone connection activated by the
logarithmic signal source at the transmitting end, the received acoustic
signal is applied to a measuring circuit and ultimately appears as a
voltage across a resistance. At any instant, this voltage is proportional
to 10^{C/20} where C is the response, in dB, of the telephone connection at a
particular frequency. However, because of the sweep rate of the signal
source, voltage elements appearing across the resistance in time sequence
are proportional to 10^{Z_s/20}, where Z_s is the effective level of the received
speech spectrum. We have already postulated that loudness doubles for
a 13.2-dB change in Z_s; the voltage elements we would like to see
across the resistance should therefore be proportional to 10^{Z_s/44}.* Thus,
the desired voltage elements, V_s, are proportional to the 2.2 root of
the voltage elements, V_x, which actually appear, i.e.,

    V_s \propto (V_x)^{1/2.2}                                    (9)

or

    (V_s)^{2.2} \propto V_x.                                     (10)

The above relationship suggests that voltage elements proportional
to 10^{Z_s/20} be applied to the input of a 2.2-to-1 compressor. With such
a device, changes at the input proportional to

    10^{(Z_{s2} - Z_{s1})/20}

appear at the output proportional to

    10^{(Z_{s2} - Z_{s1})/44},

which is the 2.2 root of the input change. Thus, the action of the compressor
approximates the loudness scale of Fig. 6 in converting elements
of the received effective spectrum to loudness units.

* If loudness N_2 is twice as large as loudness N_1,

    N_2 / N_1 = 10^{(Z_{s2} - Z_{s1})/x} = 2,

    Z_{s2} - Z_{s1} = x \log_{10} 2 = 13.2,

and

    x = 44.

Thus

    N \propto 10^{Z_s/44}.

Integration of the voltage with respect to time may be either on a
square-law or linear basis. Without the compressor, square-law integration
adds voltage elements proportional to 10^{Z_s/10} while linear integration
adds voltage elements proportional to 10^{Z_s/20}. By including the
compressor with linear integration, elements proportional to 10^{Z_s/44} are
added, these in turn being proportional to loudness units. The combination
of the two gives, in effect, 2.2 root law addition.
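
The combined effect of the compressor and linear integration can be sketched numerically. The code below is only an illustration under stated assumptions: the connection loss is taken to be sampled at equal steps of log frequency (equal time steps of the sweep), the compressor is treated as an ideal 2.2-root law at all levels, and the function name and sample values are hypothetical.

```python
import math


def ears_style_loss(loss_db_samples, root=2.2):
    """Loudness loss (dB) from losses sampled at equal steps of log frequency.

    root=2.2 models the compressor followed by linear integration;
    root=1.0 would correspond to linear integration without the compressor.
    """
    db_per_tenfold = 20.0 * root  # 44 dB for the 2.2-to-1 compressor
    # Each sample contributes an element proportional to 10^(-loss/44);
    # linear integration over the sweep is modeled as a plain average.
    mean_output = sum(10.0 ** (-loss / db_per_tenfold)
                      for loss in loss_db_samples) / len(loss_db_samples)
    return -db_per_tenfold * math.log10(mean_output)


flat_loss = ears_style_loss([10.0] * 8)
shaped = [0.0, 3.0, 12.0, 20.0]           # hypothetical frequency-dependent loss
shaped_loss = ears_style_loss(shaped)
padded_loss = ears_style_loss([x + 6.0 for x in shaped])
print(round(flat_loss, 2), round(padded_loss - shaped_loss, 2))
```

A flat loss is returned unchanged, and adding flat attenuation to any response shifts the result by the same number of dB, which is the property the meter calibration relies on.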

System operation to this point corresponds to use of the computation
form of Fig. 14 in that the integrated voltage, V_s, is proportional
to N_s. A means of expressing this voltage in dB is needed for the
same reasons that the curve of Fig. 13 was derived to permit interpretation
of N_s. This can be accomplished using an indicating meter
with a dB scale. By making the meter circuit an "averaging" one 
with a time constant long compared to the sweep time of the source 
signal, deflections of the needle will be approximately proportional to 
the total number of loudness units. A dB scale corresponding to the 
conversion specified by the curve of Fig. 13 could be provided. This 
might be difficult since the constant of proportionality of the needle 
deflections to total number of loudness units is not readily obtainable. 
Instead, advantage was taken of another approximation. 

Comparison of the curves of Figs. 6 and 13 shows that above their 
respective knees the shapes of these curves are the same. If we assume 
a straight-line relationship between L_s and N_s as was done earlier for
the relation between n_s and Z_s, the addition of distortionless attenuation
in any speech spectrum will produce a dB loudness level change
equal to the change in attenuation. Thus, the meter scale can be so 
calibrated that it obeys the same law of addition as that provided 
by the compressor and linear integrator in converting elements of 
effective level to loudness, namely a 13.2-dB change in effective level 
would correspond to doubling the total number of loudness units. The 
scale on the meter would show a difference of 13.2 dB between half- 
scale and full-scale deflection of the needle, or between any two de- 
flection points, the greater of which is twice the smaller. Conse- 
quently, the meter reading would reflect a change in distortionless 
loss in the telephone connection under test, dB for dB. 

4.2 Description of EARS

A block diagram of EARS is shown on Fig. 18. The system con- 
sists of two parts: a signal source and a measuring subsystem. The 
latter includes provision for measuring both voltage and pressure. 3 

4.2.1 Signal Source 

The logarithmic oscillator provides at its output terminals a signal 
which sweeps logarithmically with time from 300 Hz to 3300 Hz to 
300 Hz and has a flat amplitude versus frequency characteristic. The 
reason for limiting the measuring band of the EARS to 300-3300 Hz 
is a practical one. The use of partial connection ratings as an engi- 
neering tool implicitly requires that, for any given connection, the 
sum of the partial ratings should approximately equal the overall 
rating. Thus, the bandwidth used to obtain these ratings should approximate
the bandwidth of the most restrictive element(s) in the
connection in order to avoid cumulating bandwidth penalties when 
summing partial ratings. The specific limits of 300 Hz and 3300 Hz 
were selected by reviewing bandwidth characteristics of various 
telephone equipments and facilities. 

[Fig. 18. Block diagram of the EARS.]

The oscillator sweep rate is 6 times per second where a sweep is
defined in terms of the 300-3300-Hz band. Criteria leading to selection
of this sweep rate were (i) the sweep interval (1/sweep rate)
should be small as compared to an easily realizable integrating time
constant for the indicating meter and (ii) the rate should approximate
the syllabic rate of speech. The latter characteristic appears desirable
when measuring carbon transmitters in order to ensure that
these operate at an efficiency comparable to that obtained under
actual use conditions, i.e., with real speech applied. (Recent measurements
indicate that the sweep rate can be changed over the range
2 to 10 sweeps per second without significantly changing the ratings
of many telephone sets of modern design.)
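
The frequency-versus-time law of such a source can be sketched as follows. This is an illustration only: it assumes that one "sweep" is a single traverse of the 300-3300-Hz band lasting 1/6 second and that the return traverse mirrors the upward one; the function and constant names are hypothetical and say nothing about how the actual oscillator was built.

```python
F_LOW_HZ = 300.0
F_HIGH_HZ = 3300.0
SWEEPS_PER_SECOND = 6.0


def sweep_frequency(t_seconds):
    """Instantaneous frequency (Hz) of a log sweep going 300 -> 3300 -> 300 Hz."""
    traverse_time = 1.0 / SWEEPS_PER_SECOND
    phase = (t_seconds / traverse_time) % 2.0      # 0..1 upward, 1..2 downward
    x = phase if phase <= 1.0 else 2.0 - phase      # fraction of the way up the band
    # Equal time steps map to equal steps of log frequency.
    return F_LOW_HZ * (F_HIGH_HZ / F_LOW_HZ) ** x


f_start = sweep_frequency(0.0)
f_mid = sweep_frequency(1.0 / 12.0)   # halfway through the upward traverse
f_top = sweep_frequency(1.0 / 6.0)
print(round(f_start), round(f_mid), round(f_top))
```

Halfway through a traverse the frequency is the geometric mean of the band limits (about 995 Hz), not the arithmetic mean, which is what distinguishes a logarithmic sweep from a linear one.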

The conditioning network (a 6-dB attenuator) is momentarily 
switched out of the source circuit prior to measuring carbon trans- 
mitters thus increasing the source signal level by 6 dB. The higher 
level is intended to condition the transmitters to operate at the proper 
level. Conditioning is not required when measuring linear transmitters. 

The artificial mouth equalizer comprises a passive network whose 
frequency response is the inverse of the electric-to-acoustic response 
of the artificial mouth. The loss of this network is compensated for 
by the amplifier. 

The artificial mouth used is a permanent magnet, moving coil loudspeaking
unit, and is, for all practical purposes, the equivalent of an
earlier proposal (Ref. 52). The mouth includes, as an integral part, a lip ring
which is used as a reference for obtaining the proper spatial relationship
between the artificial mouth and telephone instruments under
test. The location of the lip ring has been empirically determined so
as to correspond approximately to the plane of the lips of a human
mouth (Ref. 52).

Ideally, the source arrangement should, with an oscillator output 
which is flat with frequency over the band of interest, provide an 
output pressure at the artificial mouth lip ring which is also flat with 
frequency.* Practically, control of the overall response to within 
±1 dB of the 1000-Hz value provides acceptable operation, intro- 
ducing less than about 0.2 dB error in ratings. 

* The pressure is measured with a Type L microphone (Ref. 53). The microphone is
located in a carefully gauged position so selected that the pressure measured at
that point corresponds closely to the pressure at the center of the lip ring
opening (Ref. 54).

4.2.2 Measuring Subsystem

The measuring subsystem is arranged to permit both voltage and
pressure measurements. For voltage measurements, the input is a
high-impedance transformer, and is bridged across the selected impedance
(usually a 900-ohm resistor) terminating the telephone connection
at the point where voltage measurement is desired. The transformer
is connected to an attenuator, used in system calibration to
compensate for gain drift of amplifiers, thence to a switch node.

The pressure measuring circuit consists of a 6-cm³ test coupler
equipped with a Type L pressure microphone.* The microphone is connected
to a condenser microphone amplifier which provides bias voltage
to the microphone and which produces at its output a voltage proportional
to the pressure developed in the 6-cm³ cavity by the telephone
receiver under test. The amplifier is connected through a calibrating
attenuator to a switch node.

The measurement mode is selected by operating the switch to 
"pressure measurement" or "voltage measurement." The switch swinger 
connects to an attenuator (designated Rating Control), amplifier, 
compressor, and vacuum tube voltmeter. The compressor design em- 
ployed provides a 2.2-to-l characteristic over a limited range. Opera- 
tion is confined to this range by holding the compressor output voltage 
at a constant value, indicated on the vacuum tube voltmeter as the 
"reference" or "zero" point, and adjusting for rating changes using 
the Rating Control. This takes advantage of the fact that, as pre- 
viously noted, flat attenuation changes ahead of the compressor equate, 
on a dB-for-dB basis, to loudness level changes. 

4.2.3 System Calibration 

System calibration consists of first removing the condenser microphone
from the coupler and locating it at the gauged position relative to
the artificial mouth lip ring, then adjusting the source to deliver
reference test pressure at that point. This adjustment is based on the
condenser microphone calibration, and does not involve the EARS
measuring subsystem. When the proper test pressure has been obtained,
the EARS measuring subsystem is switched to the pressure measuring
mode, the Rating Control is set at "0" dB, and the "Pressure
Adjust Attenuator" is set to obtain the reference reading on the vacuum
tube voltmeter. Since the pressure spectrum being measured is flat
with frequency, pressure level equates to loudness pressure level.

Calibration of the voltage measuring mode proceeds in a similar
manner. A voltage signal, derived from the log oscillator, is applied
to a 900-ohm resistor. The voltage developed is first read directly with
a voltmeter. Then the high-impedance bridging connection of the
EARS measuring subsystem is connected across the resistor, the system
is switched to the voltage measuring mode, and the Rating Control is
set to read the voltage measured with the voltmeter above, i.e., if
the latter was -2 dB relative to 1 volt, then the Rating Control is set
to read -2 dB relative to the "reference" or "zero" setting. The
"Voltage Zero Adjust" is then set to obtain the "reference" or "zero"
reading on the measuring subsystem vacuum tube voltmeter. Since
the voltage spectrum being measured is flat with frequency, voltage
level equates to loudness voltage level.

* The simple 6-cm³ test coupler used, conforming in design to present standards
(Fig. 3 of Ref. 55), has been found suitable for measurements of the type considered
in this report. However, handsets with ear caps of unusual shape may
require a different coupler configuration.

4.2.4 Rating Measurements 

Ratings of partial or overall telephone connections are established
by (i) the reading of the Rating Control and (ii) the reference pressure
and/or voltage levels employed. Examples of rating measurements,
including signal levels employed with the present EARS, are
given on Figs. 19, 20, and 21. The transmitting and receiving loops
of Figs. 19 and 20 respectively are the same as those for the overall
connection of Fig. 21. The sum of the component ratings indicates a
loudness loss of 17.3 dB (= -18.3 + 25.6 + 10) while the overall
rating from Fig. 21 is 17.1 dB, a discrepancy of 0.2 dB. Had the measurement
band been increased from 300-3300 Hz to, for example, 100-5000 Hz,
the discrepancy would have been somewhat larger because of
the bandpass response of both the transmitter and receiver.
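
The additivity check described above amounts to simple arithmetic, restated here as a short sketch. The three component values and the overall rating are those quoted in the text; the variable names are hypothetical, and no attempt is made to say which component belongs to which figure.

```python
component_ratings_db = [-18.3, 25.6, 10.0]   # component ratings quoted in the text
overall_rating_db = 17.1                     # overall connection rating quoted in the text

sum_of_parts_db = sum(component_ratings_db)
discrepancy_db = sum_of_parts_db - overall_rating_db
print(round(sum_of_parts_db, 1), round(discrepancy_db, 1))
```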

[Fig. 19. Transmitting loop rating.]

[Fig. 20. Receiving loop rating.]

4.3 Graphical Computation of Loudness 

A form for graphical determination of loudness ratings from the
amplitude response characteristics of telephone connections is shown
on Fig. 22. This form is based on the same considerations as those
leading to the EARS, and is discussed in this paper because it is the
vehicle used to study the effects of differences between the loudness
computation method and the EARS.

[Fig. 22. Graph paper for computing loudness ratings.]

The lower abscissa scale is frequency in Hz, corresponding on a
logarithmic basis to the upper abscissa scale in inches. (Use of inches
is arbitrary; any length units could be used.) Thus, equal increments
on the upper scale correspond to equal increments of log (f_2/f_1) where
f_2 > f_1. (This reflects the straight-line approximation shown on Fig.
17.) The equal distance increments closely approximate bands of the
B_wt - X spectrum which are interpreted to be of equal loudness when
listened to by the human ear.

The right-hand ordinate scale is proportional to the 2.2 root of a
voltage, current, or pressure. This scale and grid are constructed
according to the equation

    44 \log_{10} (2X) = 55 - L                                   (11)

where

    X = inches (left-hand ordinate scale)

    L = loss of circuit in dB (right-hand ordinate scale).

(Equation (11) assumes that the reference input to the connection
is flat with frequency.) The ordinate is measured in inches from the
bottom line of the graph which corresponds to zero output voltage,
current, or pressure. Correspondence between ordinate scales is demonstrated
in Table XIII. Boxes at the top of the sheet are provided
for recording graph measurements.
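
Equation (11) can be turned into a pair of conversion functions, which makes it easy to check entries such as those of Table XIII. This is a minimal sketch: the function names are hypothetical, the logarithm is taken to base 10, and the input/output column is assumed to be an amplitude ratio, 10^{L/20}.

```python
import math


def loss_db_to_inches(loss_db):
    """Ordinate height X (inches) for a loss L (dB), from 44 log10(2X) = 55 - L."""
    return 0.5 * 10.0 ** ((55.0 - loss_db) / 44.0)


def inches_to_loss_db(height_in):
    """Loss L (dB) for an ordinate height X (inches), the inverse of the above."""
    return 55.0 - 44.0 * math.log10(2.0 * height_in)


for loss_db in (55.0, 11.0, 0.0):
    height = loss_db_to_inches(loss_db)
    amplitude_ratio = 10.0 ** (loss_db / 20.0)   # assumed meaning of "Input/Output"
    print(f"{height:5.2f} in   ratio {amplitude_ratio:7.2f}   {loss_db:4.1f} dB")
```

A loss of 55 dB maps to 0.5 inch and a loss of 11 dB to 5 inches, while zero loss lands a little under 9 inches; zero height corresponds to infinite loss.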

In order to use the graph paper, the loss (in dB) of the partial
or overall telephone connection must be known over the band of
interest. This loss data may be in terms of

    input pressure / output pressure,
    input voltage / output voltage,
    input pressure (millibars) / output voltage (volts),

or

    input voltage (volts) / output pressure (millibars).

The loss frequency characteristic is plotted on the graph paper. 
The right-hand ordinate scale may be adjusted by a constant for 
negative losses, i.e., gains. Where large losses are encountered, greater 
accuracy can be obtained by similarly applying an adjustment con- 
stant. To illustrate, if the lowest loss across the band of interest is 
-15 dB, the values along the right-hand ordinate scale should have
15 subtracted from them. If, on the other hand, the lowest loss is +15 
dB, the right-hand ordinate values should be increased by 15. 

The average height of the response as plotted on the graph paper 
is determined by measuring the area (in square inches) under the 
curve and dividing by the base width (in inches). The average height 
is then located on the left-hand scale and the corresponding dB value 
read from the right-hand scale. This dB value is the rating based on 
reference input level. 
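
The area-averaging step can be imitated numerically. In the sketch below the trapezoidal rule stands in for measuring the area on the graph paper, the abscissa is log frequency (its scale in inches cancels when the area is divided by the base width), and the ordinate follows Eq. (11). The function names and the sample response are hypothetical.

```python
import math


def loss_db_to_inches(loss_db):
    """Ordinate height (inches) for a loss in dB, per Eq. (11)."""
    return 0.5 * 10.0 ** ((55.0 - loss_db) / 44.0)


def inches_to_loss_db(height_in):
    """Loss in dB for an ordinate height (inches), per Eq. (11)."""
    return 55.0 - 44.0 * math.log10(2.0 * height_in)


def graphical_rating(freqs_hz, losses_db):
    """Rating (dB) from the average plotted height over the band covered by freqs_hz."""
    xs = [math.log10(f) for f in freqs_hz]            # abscissa: log frequency
    heights = [loss_db_to_inches(l) for l in losses_db]
    # Trapezoidal estimate of the area under the plotted response.
    area = sum(0.5 * (heights[i] + heights[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))
    average_height = area / (xs[-1] - xs[0])
    return inches_to_loss_db(average_height)


freqs = [300.0, 500.0, 1000.0, 2000.0, 3300.0]
flat_rating = graphical_rating(freqs, [10.0] * 5)
shaped_rating = graphical_rating(freqs, [16.0, 10.0, 8.0, 10.0, 18.0])
print(round(flat_rating, 2), round(shaped_rating, 2))
```

A flat response returns its own loss, which is a useful sanity check before applying the procedure to a measured characteristic.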

* See Ref. 3.

Table XIII. Relation Between Ordinate Distance and Loss Scales

Left-Hand Ordinate              Input/Output    Right-Hand Ordinate
Scale-Distance (Inches)                         Scale-Loss (dB)

0                               ∞               ∞
0.5                             561             55
5                               3.55            11
8.89                            1               0

For cases where the input spectrum is not flat with frequency, the
graphical computation of a circuit rating involves determining two
areas. The first of these is obtained from a plot of the actual input
spectrum on the graph paper, the second from a plot of the input
spectrum modified by the circuit response. The loudness rating is
then the dB difference between the losses computed for these areas.
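
The two-area procedure for a non-flat input spectrum can be sketched in the same way. The assumptions here are mine: the input spectrum is expressed in dB below the flat reference so that it can be plotted on the loss scale, the circuit loss simply adds to it in dB, and the trapezoidal rule replaces the area measurement. All names and sample values are hypothetical.

```python
import math


def mean_height_loss(freqs_hz, losses_db):
    """Loss (dB) equivalent to the average plotted height, using Eq. (11)."""
    xs = [math.log10(f) for f in freqs_hz]
    heights = [0.5 * 10.0 ** ((55.0 - loss) / 44.0) for loss in losses_db]
    area = sum(0.5 * (heights[i] + heights[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))
    average_height = area / (xs[-1] - xs[0])
    return 55.0 - 44.0 * math.log10(2.0 * average_height)


def rating_nonflat_input(freqs_hz, input_db_below_ref, circuit_loss_db):
    """Rating as the dB difference between the two computed losses."""
    input_only = mean_height_loss(freqs_hz, input_db_below_ref)
    through_circuit = mean_height_loss(
        freqs_hz, [a + b for a, b in zip(input_db_below_ref, circuit_loss_db)])
    return through_circuit - input_only


freqs = [300.0, 1000.0, 3300.0]
sloped_input = [0.0, 4.0, 9.0]                      # hypothetical non-flat input
flat_circuit_rating = rating_nonflat_input(freqs, sloped_input, [7.0, 7.0, 7.0])
print(round(flat_circuit_rating, 2))
```

With a distortionless circuit the rating reduces to the flat loss regardless of the input shape, as one would expect from the dB-for-dB property noted earlier.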

V. COMPARISON OF GRAPHICALLY DETERMINED RATINGS AND OBSERVED RESULTS

The speech loudness computation method accurately predicts loudness
performance of telephone connections (see Section 3.5). Laboratory
realization of this method, the EARS, involved a number of
simplifying assumptions. Thus, validation of the EARS approach re- 
quires considering (i) the accuracy of the EARS in predicting loud- 
ness performance and (ii) the effects of the various simplifying as- 
sumptions. We do this by comparing computed ratings and observed 
results for the subjective tests of Section 3.5. (Numerous other tests 
have been performed which support the EARS concept, but these are 
not considered because either the tests were limited, i.e., incomplete 
test designs, small observer groups, etc., or the test system amplitude 
response characteristics are not known.) 



Table XIV. Features of the Speech Loudness Computation
Method and the EARS Method

Feature                   Computation Method        EARS Method

Loudness Law              Figs. 6 and 8             Linear Portions of
                                                    Figs. 6 and 8
Spread-of-Loudness        Yes                       No
Correction
Analysis Bandwidth        100-5000 Hz               300-3300 Hz
Reference Bandwidth       250-4000 Hz               300-3300 Hz
Rating Definition         Orthotelephonic           Artificial Voice/6-cm³
                          Terms (Fig. 3)            Coupler Terms (Fig. 16)

Validation of the EARS utilized the graphical computation method
described in Section 4.3. This approach was necessary because the 
systems used in the tests were not available for direct measurement 
using the EARS. However, the frequency response characteristics of 
these systems were known and, thus, their loudness ratings could be 
computed. 

In theory, the graphical form (Fig. 22) and the computation form 
(Fig. 14) should provide about the same results. The reference of the 
former (the horizontal line at 0 dB of loss) corresponds to that of
the latter (entries of column 2) because of the relation between fre- 
quency weightings of the two references. There is, however, one 
important difference between the two forms. For the graphical form, 
the loudness versus effective level relation has a constant slope 
(straight-line portion of Fig. 6) while for the computation form, this 
relation has a pronounced change in slope at low effective levels (see 
Fig. 6). The effects of this difference should be most apparent at low 
received speech levels where the graphical method would indicate a 
higher received speech loudness than would the computational method. 

With the above in mind, we can now consider the ratings for the 
various tests under a number of different conditions. These conditions, 
listed in Table XIV, reflect the differences in features of the speech 
loudness computation method and the EARS. 

Computed and observed ratings are given in Tables XV through 
XIX in terms of loudness loss (positive entries) or loudness gain 
(negative entries). The arrows and associated numbers below the 
tabular entries refer to distributions of differences between the various 
conditions considered. 

We will consider Table XV in some detail, noting that since the 
other tables have an identical arrangement, such discussion will in 
general similarly apply to these tables. Referring to Table XV, 
column 1 designates the network tested, and is repeated from Table 
V. Column 2 designates the setting of the reference network for which 
the test observers judged the test and reference networks to provide 
equally loud speech. Columns 3 through 9 each contain 3 subcolumns; 
"a" and "b" are the loudness ratings of the test and reference chan- 
nels respectively and "c" represents the error, equal to "a" entries 
minus "b" entries. 

Column 3 entries are essentially a repeat of the information contained
in Table V, obtained by converting the Table V entries to L_s
values, then subtracting these from 89.4 dBt, the level of the reference
speech spectrum of Fig. 2 transmitted over a system that is flat
on an orthotelephonic basis, is sharply bandlimited to 250-4000 Hz,
and has 0 dB of loss within this band. For example, test network
125 Hz HPF (sensation level = 100 dB) of Table V has an L_s =
107.1 - 0.2 = 106.9 dBt. The loudness loss is then 89.4 - 106.9 =
-17.5 dB.
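
The worked example above can be restated as a short calculation. The numbers are those given in the text; the function and constant names are hypothetical.

```python
REFERENCE_LEVEL_DBT = 89.4   # level of the reference speech spectrum, from the text


def loudness_loss_db(loudness_level_dbt):
    """Loudness loss (positive) or gain (negative) relative to the reference level."""
    return REFERENCE_LEVEL_DBT - loudness_level_dbt


level_125hz_hpf_dbt = 107.1 - 0.2        # L_s for the 125 Hz HPF example in the text
loss_125hz_hpf_db = loudness_loss_db(level_125hz_hpf_dbt)
print(round(loss_125hz_hpf_db, 1))
```

The negative result indicates a loudness gain relative to the reference, in line with the sign convention used in Tables XV through XIX.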

Entries in columns 4 through 9 were computed using the graphical 
form of Fig. 22. Column 4 entries are essentially a repeat of the 
column 3 entries, the difference being, as noted earlier, that the 
former were computed using a linear loudness law while the latter 
were computed using the loudness law of Figs. 6 and 8. Column 5 
entries were computed without applying spread-of-loudness correc- 
tions. 

The reference spectrum for computing the values given in columns
3, 4, and 5 was bandlimited to 250-4000 Hz. For example, entries
of column 5 were obtained by plotting each of the responses on the
form of Fig. 22, then measuring the area (square inches) enclosed
by this response curve, the base line (designated 0 inches), and the
left-hand and right-hand boundaries (designated 100 Hz and 5000
Hz respectively). This area was then divided by the base (inches)
corresponding to the 250-4000-Hz bandwidth, and the equivalent
height (inches) converted to loudness rating (dB) using the left-hand
and right-hand ordinate scales. Entries of column 6 were obtained in
the same way as those of column 5 except that the area was divided
by the base (inches) corresponding to a 100-5000-Hz bandwidth.
Thus, the change between columns 5 and 6 was simply one of reference
spectrum bandwidth.

Entries of column 7 were obtained in the same manner as those of
column 6 except that the area measured was restricted to the
300-3300-Hz bandwidth characteristic of EARS, and the reference spectrum
was limited to the 300-3300-Hz band. Entries of columns 8 and 9
were obtained in the same manner as were entries of columns 6 and 7
respectively except that artificial voice/6-cm³ coupler responses were
used for the former, orthotelephonic responses for the latter.

The entries of columns 9a and 9b were computed in a manner which 
reflects the essential features of the EARS and, therefore, these en- 
tries closely approximate (within the bounds of computational and 
measurement error) what would be measured with the EARS on those 
test and reference systems utilizing linear transducers. However, 
column 9a and 9b entries of Tables XVII and XVIII and column 9a 
entries of Table XIX do not necessarily represent what would be 
measured with the EARS. The reason for this is that the systems for
these cases utilized carbon transmitters which are nonlinear devices. 
Transmitter responses used in the computations, based on measure- 
ments made with real speech, would probably differ from the re- 
sponses pertaining during an EARS measurement because of the 
highly different nature of speech and the EARS acoustic test signal. 
This matter is now under study. 

The error distributions corresponding to each of the seven compu- 
tation methods exemplified by the column 3 through 9 entries are 
summarized in Table XX. These distributions reflect the accuracy 
of the computed ratings in predicting reference channel setting for 
equal speech loudness. The entries of column 1 are repeated from 
Table IV. These show that the computational method provides an 
accurate means of predicting subjective loudness balances for all of 
the tests except the Steinberg linear system tests at a sensation level 
= 39 dB. As noted earlier, received speech levels in a well-engineered 
communications system will seldom be this low, and if they do occur, 
are likely to represent trouble conditions. 

Comparing the entries of column 2 to those of column 1, we note
that there is little change in the error distributions, indicating that 
the graphical method is a close approximation of the computational 
method. Exceptions to this are the results for the Steinberg tests, par- 
ticularly at the lower sensation level where the average error and 
error standard deviation are somewhat greater than when using the 
computation form. The reason for this is the change from the loud- 
ness law of Figs. 6 and 8 to a linear loudness law. 

Column 3 entries indicate a further reduction in accuracy due to 
neglecting spread-of-loudness. The greatest change occurs for the 
Steinberg tests which involved numerous filter conditions and the 
Van Wynen telephone set tests which involved transducers with pro- 
nounced resonances. Both of these represent instances where spread- 
of-loudness effects would be important. Somewhat less change occurs 
for the Van Wynen linear system tests which also involved filter 
conditions. 

The remaining columns show errors which might be expected using 
the EARS as presently defined (column 7), and the EARS modified to 
incorporate a wider measuring band (column 6). For the most part, 
extending the band improves the accuracy appreciably with the ex- 
ception of the Steinberg tests and the Van Wynen telephone set tests. 
As regards the former, examination of Table XV indicates that extending
the band substantially improves the accuracy for the low-pass
filter conditions but decreases the accuracy for the high-pass filter 
conditions. 

The entries of columns 4 and 5, in a sense counterparts of columns 
6 and 7 respectively, indicate what might happen if an EARS were 
built to utilize an artificial voice and an artificial ear which were 
accurate simulations of the human voice and ear within the context 
of the orthotelephonic definitions (see Fig. 3), and the system were 
calibrated in conformance with these definitions. Comparing columns 
4 and 6, and 5 and 7, we see that the accuracy improves for the 
Van Wynen telephone set tests and the Fraser tests, remaining about 
the same for the other tests. 

Let us now direct our attention to the difference distributions of 
Tables XV through XIX. These are given at the bottom of the tables 
together with arrows showing the computational methods compared. 
To obtain the difference distributions, differences were obtained be- 
tween numerical entries, condition by condition. The first number 
associated with an arrow represents the average difference, a positive 
number signifying that ratings entered in columns at the right-hand 
tip of the arrow are numerically larger than those at the left-hand tip 
of the arrow. The second number, if given, represents the standard 
deviation of the difference distribution. In many cases, a standard 
deviation is not given because it was found to be of the order of 0.1 
dB, insignificant for present purposes. 
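
The difference-distribution procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the rating values are hypothetical (they are not entries from Tables XV through XIX), and the sample standard deviation is assumed because the text does not say which estimator was used.

```python
from statistics import mean, stdev

def difference_distribution(left_db, right_db):
    """Average and standard deviation of (right - left), condition by condition.

    A positive average means the ratings at the right-hand tip of the
    arrow are numerically larger than those at the left-hand tip.
    """
    if len(left_db) != len(right_db):
        raise ValueError("both methods must cover the same conditions")
    diffs = [r - l for l, r in zip(left_db, right_db)]
    # Assumption: sample standard deviation (n - 1 in the denominator).
    spread = stdev(diffs) if len(diffs) > 1 else 0.0
    return mean(diffs), spread

# Hypothetical ratings (dB) for three conditions under two methods.
left_method_db = [10.0, 12.0, 15.0]
right_method_db = [11.0, 14.0, 18.0]
avg, sd = difference_distribution(left_method_db, right_method_db)
print(f"average difference = {avg:.1f} dB, standard deviation = {sd:.1f} dB")
```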

The difference distributions referred to above are summarized in 
Table XXI. Note that there is a set of entries for each different 
transmitter-receiver combination excepting the Steinberg tests for 
which entries are given for each of the two sensation levels. 

Column 1 of Table XXI indicates that the graphical method results 
and computation method results differ very little except, as was noted 
in discussion of Table XX, in the case of the Steinberg tests with the 
sensation level = 39 dB. Differences in graphical ratings with and 
without spread-of-loudness included are also seen to be relatively 
small from entries of column 2. Entries of column 3 show that the 
change in ratings due to changing the reference spectrum bandwidth 
from 250-4000 Hz to 100-5000 Hz is essentially a constant, although 
some slight variation from test to test is apparent. 

The remaining columns, 4 through 7, show that (i) measuring system 
bandwidth differences and (ii) orthotelephonic versus artificial voice/ 
6-cm³ coupler differences are highly dependent on transmitter-receiver 
combinations. These differences appear to fall into two categories: 
(i) the Steinberg and Van Wynen linear system tests and the reference 
system of the Fraser tests for all of which linear, but different type, 
transducers were used; (ii) the Van Wynen telephone set and tie line 
tests and the test system of the Fraser tests for all of which nonlinear 
transducers were used. The transmitter and receiver types were identical 
(although the specific transducers were different) for the reference 
telephone set of the Van Wynen telephone set tests, the Van Wynen 
tie line tests, and the test system from the Fraser tests. 

The relatively small standard deviations of the difference 
distributions, together with the apparently strong dependence of 
average difference on transmitter-receiver combination, suggest the 
possibility of applying correction factors to ratings computed by one 
method, e.g., the EARS method, to obtain ratings for some other 
method, e.g., the computational method. The average differences 
summarized in Table XXI represent these correction factors. 
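
As a minimal sketch of how such a correction factor might be applied, the snippet below shifts a set of ratings by a single average difference. The function name, the ratings, and the correction value are hypothetical and are not taken from Table XXI; the sign convention assumed here is that the correction equals the average of (target method minus source method).

```python
def apply_correction(ratings_db, correction_db):
    """Shift each source-method rating by one correction factor (dB)."""
    return [round(r + correction_db, 1) for r in ratings_db]

# Hypothetical source-method ratings (dB) for one transmitter-receiver
# combination, and a hypothetical average difference for that combination.
source_ratings_db = [4.0, 7.5, 11.0]
correction_db = -2.0

converted_db = apply_correction(source_ratings_db, correction_db)
print(converted_db)
```

A single constant per transmitter-receiver combination is reasonable only while the standard deviation of the difference distribution stays small, which is the condition noted in the text.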

Returning to Table XX, we note rather large errors in some cases 
with the EARS method (see column 7) while the errors for the com- 
putation method are rather small (see column 1). Obviously, applica- 
tion of correction factors to the Steinberg, Van Wynen linear system, 
and Van Wynen tie line tests will not improve the EARS method 
error distributions since in each of the tests, the particular trans- 
mitter-receiver combination was common to the test and reference 
channels. 
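
The cancellation argument can be illustrated with a short sketch. It assumes, for illustration only, that a rating is expressed as the difference between a test-channel measurement and a reference-channel measurement, and that each measurement would be shifted by the correction factor of its own transmitter-receiver combination. All names and numbers are hypothetical.

```python
def relative_rating(test_db, reference_db, test_corr_db=0.0, ref_corr_db=0.0):
    """Rating of the test channel relative to the reference channel (dB)."""
    return (test_db + test_corr_db) - (reference_db + ref_corr_db)

# Hypothetical channel measurements (dB).
test_db, reference_db = 20.0, 14.0

uncorrected = relative_rating(test_db, reference_db)
# Same combination in both channels: the common correction cancels.
common = relative_rating(test_db, reference_db, 3.0, 3.0)
# Different combinations: the corrections no longer cancel.
different = relative_rating(test_db, reference_db, 3.0, 1.0)

print(uncorrected, common, different)
```

Under these assumptions a correction changes the rating only when the test and reference channels use different transducer combinations.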

Different transducers were used in the test and reference channels 
of the Van Wynen telephone set tests and the Fraser tests. Let us 
examine the effect of application of correction factors to column 9 
entries of Tables XVII and XIX on error distributions. The correc- 
tion factors can be obtained from Table XXI. When the appropriate 
entries of column 9 of Tables XVII and XIX are so converted, the 
error distribution for the Van Wynen telephone set tests is changed 
from one with average = 2.5 dB, σ = 1.0 dB to average = 0.3 dB, 
σ = 1.5 dB comparing favorably to the column 3 values (Table 
XVII). Similar values for the Fraser tests are average = -6.5 dB, 
σ = 1.2 dB before correction, average = dB, σ = 1.3 dB after 
correction, comparing favorably to the column 3 values from Table 
XIX for which average = 0.2 dB, σ = 0.9 dB. 

Review of preceding discussion and the information contained in 
Tables XV through XXI indicates that the EARS provides a simple 
and reasonably accurate method of measuring telephone connection 
loudness loss. For a telephone plant utilizing telephone sets of a single 
design, or of several similar designs, loudness ratings determined 
following the EARS procedure can provide an effective tool in telephone 
transmission engineering. In situations where the plant utilizes some- 
what different telephone set designs, it appears that reasonably good 
design can result from using the EARS method to determine perform- 
ance with the individual set designs. Comparisons between designs 
may then be made by application of correction factors. Determination 
of these correction factors can be a time-consuming and difficult task, 
as is evident from consideration of the orthotelephonic and artificial 
voice/6-cm³ coupler response definitions of Figs. 3 and 16 respectively, 
but it is worth noting that such correction factors need be determined 
only once for each transmitter-receiver combination mounted in a 
particular handset configuration. 